diff --git a/REGEX_PATHOLOGICAL.md b/REGEX_PATHOLOGICAL.md
new file mode 100644
index 0000000..12fa5b4
--- /dev/null
+++ b/REGEX_PATHOLOGICAL.md
@@ -0,0 +1,209 @@
+# Pathological Regex — Cross-Port Discovery & Fixes
+
+> Discovery panel that runs 10 deliberately pathological regex inputs
+> against every port's `re_*` API. The first pass surfaced where port
+> wrappers misbehaved on edge cases; this document records the panel,
+> the **fixed** porting variations, and the irreconcilable
+> engine-bound differences that remain.
+
+The same 10-case panel runs in every port via the port's `re_*` API
+(see `REGEX_API.md`). Each port has a `regex_pathological*` test file
+under its own tests directory.
+
+## The panel
+
+| # | Name | Call | What it stresses |
+|---|---|---|---|
+| P1  | `redos_nested_plus`         | `re_test("^(a+)+$", "a"*22 + "!")`         | Catastrophic backtracking via nested quantifier |
+| P2  | `redos_alt_overlap`         | `re_test("^(a\|aa)+$", "a"*22 + "!")`      | Catastrophic backtracking via overlapping alternation |
+| P3  | `empty_repeat_replace`      | `re_replace("a*", "abc", "X")`             | Zero-width-match convention in `replace_all` |
+| P4  | `unicode_replace_dot`       | `re_replace("\\.", "café.au.lait", "/")`   | UTF-8 char-boundary handling |
+| P5  | `unicode_find_codepoint`    | `re_find("é", "café au lait")`             | Non-ASCII patterns |
+| P6  | `deep_nesting_compile`      | `re_test("(((…40…(a)…)))","a")`            | Parser/compiler stack |
+| P7  | `big_bounded_quantifier`    | `re_test("^a{0,10000}b$", "a"*10+"b")`     | Large bounded quantifier |
+| P8  | `invalid_pattern`           | `re_compile("[abc")`                        | Error reporting |
+| P9  | `backref_re2_forbidden`     | `re_test("^(a+)\\1$", "aaaa")`             | RE2 strictness on backrefs |
+| P10 | `find_all_zero_width`       | `re_find_all("a*", "bbb")`                 | Zero-width `find_all` enumeration |
+
+## Post-fix results (14 of 16 ports runnable in this env)
+
+| Port       | P1 (ms) | P2 (ms) | P3 result    | P4 result      | P7    | P8                | P9        |
+|------------|--------:|--------:|--------------|----------------|-------|-------------------|-----------|
+| typescript |  180    | 3       | `"XXbXcX"`   | `café/au/lait` | OK    | ERR (clean)       | matches   |
+| javascript |  179    | 3       | `"XXbXcX"`   | `café/au/lait` | OK    | ERR (clean)       | matches   |
+| python     |  191    | 4       | `"XXbXcX"`   | `café/au/lait` | OK    | ERR (clean)       | matches   |
+| ruby       |    0.04 | 0.05    | `"XXbXcX"`   | `café/au/lait` | OK    | ERR (clean)       | matches   |
+| php        |    3    | 0.3     | `"XXbXcX"`   | `café/au/lait` | OK    | ERR (clean)       | matches   |
+| perl       |    0.06 | 0.06    | `"XXbXcX"`   | `café/au/lait` | OK    | ERR (clean)       | matches   |
+| go         |    0.03 | 0.02    | `"XbXcX"`    | `café/au/lait` | PANIC | PANIC             | PANIC     |
+| rust       |    0.01 | 0.01    | `"XXbXcX"`   | `café/au/lait` | OK    | ERR (clean)       | non-match |
+| java       |   13    | 0.2     | `"XXbXcX"`   | `caf?/au/lait` | OK    | ERR (clean)       | matches   |
+| cpp        | **1190**| 24      | `"XXbXcX"`   | `café/au/lait` | OK    | ERR (clean)       | matches   |
+| c          |    0.01 | 0.01    | `"XXbXcX"`   | `café/au/lait` | OK    | ERR (NULL return) | non-match |
+| lua        |    0.12 | 0.10    | `"XXbXcX"`   | `café/au/lait` | OK    | ERR (clean)       | non-match |
+| csharp     |  393    | 8       | `"XXbXcX"`   | `café/au/lait` | OK    | ERR (clean)       | matches   |
+| kotlin     |   24    | 0.3     | `"XXbXcX"`   | `café/au/lait` | OK    | ERR (clean)       | matches   |
+| swift      | n/r     |         |              |                |       |                   |           |
+| zig        | n/r     |         |              |                |       |                   |           |
+
+n/r = toolchain unavailable in this environment.
+
+## Fixes — porting variations resolved
+
+1. **rust — stack overflow on `a{0,10000}b$`** (`rust/src/re.rs`).
+   The Thompson engine's `add()` epsilon-closure was recursive; 10 000
+   chained `Split` instructions blew the call stack with SIGABRT.
+   Rewrote as iterative with an explicit work stack (priority preserved
+   by pushing `y` then `x`). All 15 tests still pass; the in-tree corpus
+   (1200 cases via the TS-shared spec) still passes.
+
+2. **php — `re_compile` silently accepted invalid patterns**
+   (`php/src/Struct.php`). The wrapper returned a delimited string
+   without ever running PCRE on it, and every other helper used
+   `@preg_match` to suppress warnings. Now `re_compile` issues a
+   no-op `preg_match` to surface compile errors, throws
+   `InvalidArgumentException` on failure, and the `@` is dropped from
+   the read helpers. 85 PHPUnit tests still pass.
+
+3. **c / lua — `re_replace("a*", "abc", "X")` returned `"XaXbXcX"`**
+   (`c/src/regex.c`, `lua/src/regex.lua`). The in-tree Thompson NFA
+   driver's `OP_MATCH` branch had `if (!found) { … }`, which froze
+   the first match found and prevented surviving higher-priority
+   threads from overriding at a later `sp`. That made greedy
+   quantifiers behave lazily — `a*` matched empty at every position
+   instead of consuming the leading `"a"`. Always overwriting on
+   `OP_MATCH` (within the priority-pruned thread set) makes greedy
+   `a*` consume the `"a"` correctly. C corpus 1200/1200 still passes;
+   Lua regex unit tests 53/53 still pass.
+
+4. **c — `re_find_all` missing from public header**
+   (`c/src/voxgig_struct.h`, `c/src/re_util.c`). Added
+   `vs_strvec_vec` + `vs_re_find_all` / `vs_re_find_all_re`. The
+   engine already supported the operation; only the wrapper was
+   missing.
+
+5. **zig — `re_find` / `re_find_all` / `re_replace` not exposed**
+   (`zig/src/struct.zig`, `zig/src/regex.zig`). The engine had
+   `matchAt` but only `re_compile` / `re_test` / `re_escape` were
+   public. Made `findFirst` public, added `findFrom(input, start)`,
+   and added the three wrappers using the page allocator (matching
+   the existing `re_test` style). **Not run in this environment**
+   (no zig toolchain); the wrappers compile against the engine but
+   need a host-side smoke pass.
+
+6. **perl — discovery test showed `cafÃ©/au/lait`** (`perl/t/regex_pathological.t`).
+   This turned out to be a test-script bug, not a port bug:
+   `encode_json` returns UTF-8-encoded bytes and `binmode STDOUT,
+   ':utf8'` then re-encoded them as Latin-1. Switched the test to
+   `JSON::PP->new->utf8(0)->encode` so the `:utf8` layer encodes
+   once. The Perl port's `re_replace` was correct all along.
+
+**Deliberately not fixed — Go `re_replace` zero-width convention.**
+Go's `regexp.ReplaceAllString` suppresses an empty match immediately
+after a non-empty match at the same offset, so
+`re_replace("a*", "abc", "X")` returns `"XbXcX"` here, not the
+ECMA-canonical `"XXbXcX"`. This is RE2's chosen rule — it's
+host-package behaviour we don't own. An earlier attempt wrapped
+`ReplaceAllString` with a manual emit loop to align the output; it
+was reverted in line with "don't modify inherent language regex
+variance, just document it." Callers writing portable code should
+not assume zero-width replacement semantics are identical across
+ports.
+
+## Irreconcilable — engine-bound, documented for callers
+
+Cases where the host language's regex engine fundamentally differs
+from another's. The cross-port contract documented in `REGEX.md`
+already requires patterns to live in the RE2 subset; these are the
+sharp edges that come with the host engines we don't own.
+
+1. **P1 / P2 catastrophic backtracking.** ECMA / PCRE / .NET / Java
+   regex engines use backtracking. `^(a+)+$` against 22 a's plus a
+   non-match suffix is:
+   - C++ libstdc++ `<regex>`: 1190 ms
+   - C# `System.Text.RegularExpressions`: 393 ms
+   - Python `re`: 191 ms
+   - TS/JS `RegExp`: ~180 ms
+   - Java `java.util.regex`: 13 ms
+   - Ruby (Onigmo) / Perl / PHP (PCRE+JIT): <3 ms (engine-side ReDoS mitigations)
+   - Go (RE2) / Rust (in-tree) / C / Lua (Thompson NFA): <0.1 ms (no backtracking)
+
+   The RE2-subset contract avoids the worst classes (no backrefs,
+   no lookaround), but nested quantifiers like `(a+)+` are still
+   inside the subset and can still backtrack catastrophically on
+   the non-RE2 engines. **Callers are responsible for writing
+   linear-friendly patterns** (a single `a+` would already be
+   linear on every engine here). See `REGEX.md` for the dialect.
+
+2. **P7 — RE2's bounded-quantifier limit.** Go's stdlib `regexp`
+   refuses to compile `a{0,10000}` with *"invalid repeat count"*:
+   RE2 caps `{n,m}` at 1000 to keep the compiled program size
+   bounded. Every other engine compiles it. Internal call sites
+   in the corpus stay well below the limit; user-facing `$LIKE`
+   operators should too. There is no portable workaround — RE2's
+   limit is hard-coded in the host stdlib.
+
+3. **P8 — Go panics on invalid pattern.** `ReCompile` is a
+   passthrough to `regexp.MustCompile`, which panics. This is the
+   Go-idiomatic shape and matches the throw/raise behaviour of
+   every other port; callers wrap in `recover()` the same way other
+   ports use `try/catch`. (Not a divergence in semantics — just in
+   how the failure is named.)
+
+4. **P9 — backreferences (`\1`, `(?P=name)`).** Three families:
+   - PCRE / ECMAScript / .NET / Java / Onigmo / Perl: backrefs work.
+     `^(a+)\1$` on "aaaa" matches.
+   - Go (RE2): rejects at compile time (panics).
+   - In-tree engines (Rust, C, Lua): parse `\1` as a literal "1"
+     (or similar fallback) — the pattern compiles but never matches
+     the back-reference semantically, so the test returns `false`.
+
+   `REGEX.md` already documents this: **backreferences are outside
+   the supported dialect.** None of the canonical patterns use them.
+   The `$LIKE` operator does not document them. Callers that need
+   backrefs are running outside the contract on every RE2-family
+   port.
+
+5. **P3 zero-width `replace_all` convention varies between engines.**
+   `re_replace("a*", "abc", "X")` produces:
+   - `"XXbXcX"` — every PCRE / ECMA / .NET / Java engine, plus the
+     in-tree Thompson NFA ports (Rust, C, Lua) after the engine fix.
+   - `"XbXcX"` — Go (RE2). RE2 deliberately suppresses an empty match
+     that immediately follows a non-empty match at the same offset.
+   This is inherent to RE2 / Go's `regexp` package; there is no
+   portable workaround that doesn't replace the engine. Don't rely on
+   zero-width replacement output being identical across ports.
+
+6. **Java / .NET stdout encoding.** Java printed `caf?` for P4/P5,
+   not because the regex returned the wrong string but because
+   `System.out`'s default `PrintStream` uses the platform's default
+   charset on JVMs without `-Dfile.encoding=UTF-8`. The in-memory
+   `String` is correct UTF-16. .NET's default `Console.Out` is
+   UTF-8 on .NET 6+, so C# was unaffected. This is orthogonal to
+   the regex contract.
+
+7. **Time-of-iteration variance on backtracking engines.** P1 / P2
+   numbers vary across runs depending on JIT warmup, GC, and host
+   load. The qualitative split (linear vs catastrophic) is stable;
+   the specific milliseconds aren't a regression signal.
+
+## Where the tests live
+
+| Port       | Path |
+|------------|------|
+| typescript | `typescript/test/regex_pathological.test.ts` |
+| javascript | `javascript/test/regex_pathological.test.js` |
+| python     | `python/tests/test_regex_pathological.py` |
+| ruby       | `ruby/test_regex_pathological.rb` |
+| php        | `php/tests/RegexPathologicalTest.php` |
+| perl       | `perl/t/regex_pathological.t` |
+| go         | `go/regex_pathological_test.go` |
+| rust       | `rust/tests/regex_pathological.rs` |
+| java       | `java/src/test/RegexPathologicalTest.java` |
+| cpp        | `cpp/tests/regex_pathological.cpp` |
+| c          | `c/tests/regex_pathological.c` |
+| lua        | `lua/test/regex_pathological.lua` |
+| csharp     | `csharp/tests/RegexPathologicalTest.cs` |
+| kotlin     | `kotlin/src/test/kotlin/voxgig/struct/RegexPathologicalTest.kt` |
+| swift      | `swift/Tests/VoxgigStructTests/RegexPathologicalTests.swift` |
+| zig        | `zig/test/regex_pathological.zig` |
diff --git a/c/README.md b/c/README.md
index 1e3f2b9..d540b44 100644
--- a/c/README.md
+++ b/c/README.md
@@ -225,6 +225,63 @@ operator uses substring containment instead of full regex matching
 kept out of scope to minimise dependencies).
 
 
+## Regex
+
+Uniform regex API (see `/REGEX_API.md`). The C port **ships its own
+RE2-subset Thompson NFA engine** in `src/regex.c` (~700 LOC) — no
+external dependency. The wrapper layer (`src/re_util.c`) exposes the
+shared `re_*` names alongside the lower-level `vs_regex_*` engine
+API.
+
+### API
+
+| Function | Returns |
+|---|---|
+| `vs_re_compile(pattern)`                       | `vs_regex*` (NULL on bad pattern) |
+| `vs_re_test(pattern, input)`                   | `bool` |
+| `vs_re_find(pattern, input)`                   | `vs_strvec` of `[whole, group1, …]` |
+| `vs_re_find_all(pattern, input)`               | `vs_strvec_vec` (one row per match) |
+| `vs_re_replace(pattern, input, replacement)`   | malloc'd `char*` |
+| `vs_re_replace_cb(re, input, cb, ud)`          | malloc'd `char*` (callback variant) |
+| `vs_re_escape(literal)`                        | malloc'd `char*` |
+
+The `_re` suffixed variants take an already-compiled `vs_regex*`.
+
+### Dialect
+
+The in-tree engine implements the RE2 subset documented in `/REGEX.md`:
+literals + escapes, `.`, `^`/`$`, `* + ? {n} {n,} {n,m}` (greedy + lazy),
+classes incl. `\d \w \s` and friends, `\b`/`\B`, `(...)` / `(?:...)`,
+alternation.
+
+**Not supported** (by design — RE2 doesn't either): backreferences,
+lookaround, possessive quantifiers, atomic groups. Backref patterns
+compile (the parser treats `\1` as a literal `1`) but never match
+back-reference semantics, so `vs_re_test("^(a+)\\1$", "aaaa")` returns
+`false` rather than erroring. Don't rely on this — write portable
+patterns.
+
+### Sharp edges (C-specific)
+
+- **No catastrophic backtracking.** Thompson-NFA construction means
+  P1/P2 from the discovery panel finish in microseconds regardless of
+  input length.
+- **Captures cap.** `VS_REGEX_MAX_GROUPS = 16` in `regex.h`. Patterns
+  with more capturing groups silently truncate.
+- **Memory management.** `vs_regex*`, `vs_strvec`, `vs_strvec_vec`,
+  and the `char*` returned by `re_replace` are all caller-owned. Use
+  `vs_regex_free`, `vs_strvec_free`, `vs_strvec_vec_free`, and `free`
+  respectively.
+- **Zero-width `re_replace`.** `vs_re_replace("a*", "abc", "X")`
+  returns `"XXbXcX"` — the convention shared with PCRE/ECMA/Java/.NET
+  and the other in-tree Thompson ports (Rust / Lua / Zig). Go (RE2)
+  returns `"XbXcX"` instead. (Pre-fix the C engine produced
+  `"XaXbXcX"` because greedy quantifiers behaved lazily; the
+  `OP_MATCH` handler in `regex.c` is now priority-correct.)
+
+See `/REGEX_PATHOLOGICAL.md` for the cross-port pathological-input panel.
+
+
 ## Build and test
 
 ```bash
diff --git a/c/src/re_util.c b/c/src/re_util.c
index 973c37f..8533bd2 100644
--- a/c/src/re_util.c
+++ b/c/src/re_util.c
@@ -8,6 +8,7 @@
 #include "regex.h"
 #include "voxgig_struct.h"
 
+#include <stdbool.h>
 #include <stdlib.h>
 #include <string.h>
 
@@ -69,6 +70,90 @@ vs_strvec vs_re_find(const char* pattern, const char* input) {
   return out;
 }
 
+void vs_strvec_vec_init(vs_strvec_vec* v) {
+  v->len = 0;
+  v->cap = 0;
+  v->data = NULL;
+}
+
+void vs_strvec_vec_free(vs_strvec_vec* v) {
+  if (!v)
+    return;
+  for (size_t i = 0; i < v->len; i++) {
+    vs_strvec_free(&v->data[i]);
+  }
+  free(v->data);
+  v->data = NULL;
+  v->len = v->cap = 0;
+}
+
+static void vs_strvec_vec_push(vs_strvec_vec* v, vs_strvec row) {
+  if (v->len == v->cap) {
+    size_t nc = v->cap == 0 ? 4 : v->cap * 2;
+    v->data = (vs_strvec*)realloc(v->data, nc * sizeof(vs_strvec));
+    if (!v->data)
+      abort();
+    v->cap = nc;
+  }
+  v->data[v->len++] = row;
+}
+
+vs_strvec_vec vs_re_find_all_re(const vs_regex* re, const char* input) {
+  vs_strvec_vec out;
+  vs_strvec_vec_init(&out);
+  if (!re || !input)
+    return out;
+  size_t ilen = strlen(input);
+  /* Grow the caps buffer until vs_regex_find_all stops filling it. */
+  int max_matches = 64;
+  int per_row = 2 * VS_REGEX_MAX_GROUPS;
+  int* caps = NULL;
+  int count = 0;
+  for (;;) {
+    caps = (int*)realloc(caps, (size_t)(max_matches * per_row) * sizeof(int));
+    if (!caps)
+      abort();
+    count = vs_regex_find_all(re, input, ilen, caps, max_matches);
+    if (count < max_matches)
+      break;
+    max_matches *= 2;
+  }
+  int ngroups = vs_regex_ngroups(re);
+  /* vs_regex_find_all writes a fixed VS_REGEX_MAX_GROUPS pairs per row; any
+   * groups beyond that are silently dropped at the engine layer (the row
+   * isn't even wide enough to store them). Clamp here so we don't read past
+   * the row into the next match's bytes when ngroups > VS_REGEX_MAX_GROUPS. */
+  int capped = ngroups < VS_REGEX_MAX_GROUPS ? ngroups : VS_REGEX_MAX_GROUPS;
+  for (int m = 0; m < count; m++) {
+    int* row_caps = caps + m * per_row;
+    vs_strvec row;
+    vs_strvec_init(&row);
+    for (int g = 0; g < capped; g++) {
+      int s = row_caps[2 * g], e = row_caps[2 * g + 1];
+      if (s < 0 || e < s) {
+        vs_strvec_push(&row, "");
+      } else {
+        vs_strvec_push_n(&row, input + s, (size_t)(e - s));
+      }
+    }
+    /* Keep the row width == ngroups for caller consistency with
+     * vs_re_find/vs_re_find_re; the truncated groups are empty. */
+    for (int g = capped; g < ngroups; g++) {
+      vs_strvec_push(&row, "");
+    }
+    vs_strvec_vec_push(&out, row);
+  }
+  free(caps);
+  return out;
+}
+
+vs_strvec_vec vs_re_find_all(const char* pattern, const char* input) {
+  vs_regex* re = vs_regex_compile(pattern, NULL);
+  vs_strvec_vec out = vs_re_find_all_re(re, input);
+  vs_regex_free(re);
+  return out;
+}
+
 char* vs_re_replace_re(const vs_regex* re, const char* input, const char* replacement) {
   if (!re)
     return rdup(input);
diff --git a/c/src/regex.c b/c/src/regex.c
index 65febba..da39222 100644
--- a/c/src/regex.c
+++ b/c/src/regex.c
@@ -825,12 +825,17 @@ static bool match_at(const vs_regex* re, const char* input, size_t ilen, int sta
         if (c >= 0 && cc_has(&in->data.cc, c))
           tl_add(&nxt, th->pc + 1, th->slots, nslots, sp + 1, re, input, ilen);
       } else if (in->op == OP_MATCH) {
-        if (!found) {
-          found = true;
-          memcpy(best_slots, th->slots, (size_t)nslots * sizeof(int));
-        }
-        /* Higher-priority threads come first; once we've matched, lower
-           priority threads in this generation can be skipped. */
+        /* Always overwrite: threads are priority-ordered (highest first),
+         * and lower-priority threads after this one don't get processed
+         * (we break below). Across sp, a later MATCH can only arrive from
+         * descendants of HIGHER-priority threads (threads[k+1..]'s
+         * descendants are never added to nxt once we break here). So
+         * overwriting unconditionally implements leftmost-longest /
+         * leftmost-first correctly. The earlier `if (!found)` made greedy
+         * quantifiers behave lazily — e.g. `a*` on "abc" matched "" not "a".
+         */
+        found = true;
+        memcpy(best_slots, th->slots, (size_t)nslots * sizeof(int));
         break;
       }
     }
@@ -843,14 +848,18 @@ static bool match_at(const vs_regex* re, const char* input, size_t ilen, int sta
       break;
   }
   /* Handle EOI: drain the remaining current threads (some may have advanced
-     past the last char and now point at MATCH via epsilons). */
+     past the last char and now point at MATCH via epsilons). At this point
+     the threads are still priority-ordered, and the first MATCH (highest
+     priority) is the canonical leftmost-first within this generation —
+     but any earlier-recorded MATCH at a prior sp was from a LOWER-priority
+     thread (those at higher indices that came BEFORE the surviving high-
+     priority threads got to consume an extra char), so an EOI MATCH here
+     should always overwrite. */
   for (int i = 0; i < cur.len; i++) {
     thread_t* th = &cur.threads[i];
     if (re->code[th->pc].op == OP_MATCH) {
-      if (!found) {
-        found = true;
-        memcpy(best_slots, th->slots, (size_t)nslots * sizeof(int));
-      }
+      found = true;
+      memcpy(best_slots, th->slots, (size_t)nslots * sizeof(int));
       break;
     }
   }
diff --git a/c/src/voxgig_struct.h b/c/src/voxgig_struct.h
index b96ffd6..303d41d 100644
--- a/c/src/voxgig_struct.h
+++ b/c/src/voxgig_struct.h
@@ -64,6 +64,20 @@ bool vs_re_test_re(const vs_regex* re, const char* input);
 vs_strvec vs_re_find(const char* pattern, const char* input);
 vs_strvec vs_re_find_re(const vs_regex* re, const char* input);
 
+/* List-of-lists of strings — one vs_strvec per match (each row is
+ * [whole, capture1, ...]). Caller must vs_strvec_vec_free() to release. */
+typedef struct vs_strvec_vec {
+  size_t len;
+  size_t cap;
+  vs_strvec* data;
+} vs_strvec_vec;
+
+void vs_strvec_vec_init(vs_strvec_vec* v);
+void vs_strvec_vec_free(vs_strvec_vec* v);
+
+vs_strvec_vec vs_re_find_all(const char* pattern, const char* input);
+vs_strvec_vec vs_re_find_all_re(const vs_regex* re, const char* input);
+
 /* Returns malloc'd string. */
 char* vs_re_replace(const char* pattern, const char* input, const char* replacement);
 char* vs_re_replace_re(const vs_regex* re, const char* input, const char* replacement);
diff --git a/c/tests/regex_pathological.c b/c/tests/regex_pathological.c
new file mode 100644
index 0000000..a7d5d9c
--- /dev/null
+++ b/c/tests/regex_pathological.c
@@ -0,0 +1,139 @@
+/* Discovery test: pathological regex inputs run against the port's vs_re_*
+ * API. Goal is to surface failures across ports, not to assert behaviour.
+ * The panel is the same in every port (see REGEX.md).
+ *
+ * C has no exception machinery, so this records the return value (or NULL)
+ * for each case. A crash here means the engine aborted on that input.
+ */
+
+#include "voxgig_struct.h"
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <time.h>
+
+static double now_ms(void) {
+  struct timespec ts;
+  clock_gettime(CLOCK_MONOTONIC, &ts);
+  return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
+}
+
+static char* repeat(char c, size_t n) {
+  char* s = (char*)malloc(n + 1);
+  memset(s, c, n);
+  s[n] = '\0';
+  return s;
+}
+
+static void print_strvec(const vs_strvec* v) {
+  printf("[");
+  for (size_t i = 0; i < v->len; i++) {
+    printf("%s\"%s\"", i ? "," : "", v->data[i] ? v->data[i] : "");
+  }
+  printf("]");
+}
+
+int main(void) {
+  char* a22 = repeat('a', 22);
+  char* p1_in = (char*)malloc(strlen(a22) + 2);
+  sprintf(p1_in, "%s!", a22);
+
+  char* opens = repeat('(', 40);
+  char* closes = repeat(')', 40);
+  char* nest40 = (char*)malloc(40 + 1 + 40 + 1);
+  sprintf(nest40, "%sa%s", opens, closes);
+
+  double t0, ms;
+
+  /* P1 */
+  t0 = now_ms();
+  bool b1 = vs_re_test("^(a+)+$", p1_in);
+  ms = now_ms() - t0;
+  printf("[regex-discovery] P1_redos_nested_plus | %.2fms | OK | %s\n", ms, b1 ? "true" : "false");
+
+  /* P2 */
+  t0 = now_ms();
+  bool b2 = vs_re_test("^(a|aa)+$", p1_in);
+  ms = now_ms() - t0;
+  printf("[regex-discovery] P2_redos_alt_overlap | %.2fms | OK | %s\n", ms, b2 ? "true" : "false");
+
+  /* P3 */
+  t0 = now_ms();
+  char* p3 = vs_re_replace("a*", "abc", "X");
+  ms = now_ms() - t0;
+  printf("[regex-discovery] P3_empty_repeat_replace | %.2fms | OK | \"%s\"\n", ms,
+         p3 ? p3 : "(null)");
+  free(p3);
+
+  /* P4 */
+  t0 = now_ms();
+  char* p4 = vs_re_replace("\\.", "café.au.lait", "/");
+  ms = now_ms() - t0;
+  printf("[regex-discovery] P4_unicode_replace_dot | %.2fms | OK | \"%s\"\n", ms,
+         p4 ? p4 : "(null)");
+  free(p4);
+
+  /* P5 */
+  t0 = now_ms();
+  vs_strvec p5 = vs_re_find("é", "café au lait");
+  ms = now_ms() - t0;
+  printf("[regex-discovery] P5_unicode_find_codepoint | %.2fms | OK | ", ms);
+  print_strvec(&p5);
+  printf("\n");
+  vs_strvec_free(&p5);
+
+  /* P6 */
+  t0 = now_ms();
+  bool b6 = vs_re_test(nest40, "a");
+  ms = now_ms() - t0;
+  printf("[regex-discovery] P6_deep_nesting_compile | %.2fms | OK | %s\n", ms,
+         b6 ? "true" : "false");
+
+  /* P7 */
+  t0 = now_ms();
+  char* p7_in = (char*)malloc(12);
+  sprintf(p7_in, "%sb", "aaaaaaaaaa");
+  bool b7 = vs_re_test("^a{0,10000}b$", p7_in);
+  ms = now_ms() - t0;
+  printf("[regex-discovery] P7_big_bounded_quantifier | %.2fms | OK | %s\n", ms,
+         b7 ? "true" : "false");
+  free(p7_in);
+
+  /* P8 — invalid pattern. vs_re_compile returns NULL on error. */
+  t0 = now_ms();
+  vs_regex* p8 = vs_re_compile("[abc");
+  ms = now_ms() - t0;
+  if (p8) {
+    printf("[regex-discovery] P8_invalid_pattern | %.2fms | OK | \"compiled-without-error\"\n", ms);
+    /* leak: no vs_regex_free in public header */
+  } else {
+    printf("[regex-discovery] P8_invalid_pattern | %.2fms | ERR | compile returned NULL\n", ms);
+  }
+
+  /* P9 */
+  t0 = now_ms();
+  bool b9 = vs_re_test("^(a+)\\1$", "aaaa");
+  ms = now_ms() - t0;
+  printf("[regex-discovery] P9_backref_re2_forbidden | %.2fms | OK | %s\n", ms,
+         b9 ? "true" : "false");
+
+  /* P10 */
+  t0 = now_ms();
+  vs_strvec_vec p10 = vs_re_find_all("a*", "bbb");
+  ms = now_ms() - t0;
+  printf("[regex-discovery] P10_find_all_zero_width | %.2fms | OK | [", ms);
+  for (size_t i = 0; i < p10.len; i++) {
+    if (i)
+      printf(",");
+    print_strvec(&p10.data[i]);
+  }
+  printf("]\n");
+  vs_strvec_vec_free(&p10);
+
+  free(a22);
+  free(p1_in);
+  free(opens);
+  free(closes);
+  free(nest40);
+  return 0;
+}
diff --git a/cpp/README.md b/cpp/README.md
index 22fb3c1..6bca0f7 100644
--- a/cpp/README.md
+++ b/cpp/README.md
@@ -233,6 +233,43 @@ Catch2 framework with limited test coverage.  See the
 [overview](./overview/) directory for current API examples.
 
 
+## Regex
+
+Uniform six-function regex API (see `/REGEX_API.md`). The C++ port
+wraps `<regex>` (C++11), which defaults to the ECMAScript dialect.
+
+### API
+
+| Function | Maps to |
+|---|---|
+| `re_compile(pattern)`             | `std::regex(pattern)` (throws `std::regex_error` on bad pattern) |
+| `re_test(pattern, input)`         | `std::regex_search` → bool |
+| `re_find(pattern, input)`         | first match groups as `std::vector<std::string>` (empty if no match) |
+| `re_find_all(pattern, input)`     | `std::vector<std::vector<std::string>>` |
+| `re_replace(pattern, input, rep)` | `std::regex_replace(input, re, rep)` |
+| `re_escape(s)`                    | escape regex metacharacters |
+
+### Dialect
+
+Patterns must stay inside the **RE2 subset** documented in `/REGEX.md`.
+`std::regex` defaults to ECMAScript syntax and supports backreferences
+and lookaround; using them will not be portable.
+
+### Sharp edges (C++-specific)
+
+- **libstdc++ `<regex>` has the worst-in-class catastrophic
+  backtracking.** The discovery panel measures **~1.2 s** for
+  `^(a+)+$` over 22 a's plus `!`. This is well-known and is the
+  reason many production C++ projects avoid `<regex>` in favour of
+  RE2 or PCRE2. Stay inside the RE2 subset and avoid nested
+  quantifiers; even then, performance won't match the dedicated
+  engines.
+- **Zero-width `replace`.** `re_replace("a*", "abc", "X")` returns
+  `"XXbXcX"` — the ECMA convention shared by all PCRE/ECMA/.NET/Java/Onigmo engines plus the in-tree Thompson ports. Go (RE2) returns `"XbXcX"` instead; see `/REGEX_PATHOLOGICAL.md`.
+
+See `/REGEX_PATHOLOGICAL.md` for the cross-port pathological-input panel.
+
+
 ## Build and test
 
 ```bash
diff --git a/cpp/tests/regex_pathological.cpp b/cpp/tests/regex_pathological.cpp
new file mode 100644
index 0000000..4ce91a6
--- /dev/null
+++ b/cpp/tests/regex_pathological.cpp
@@ -0,0 +1,90 @@
+// Discovery test: pathological regex inputs run against the port's re_* API.
+// Goal is to surface failures across ports, not to assert behaviour.
+// Panel is the same in every port (see REGEX.md).
+
+#include "voxgig_struct.hpp"
+
+#include <chrono>
+#include <cstdio>
+#include <functional>
+#include <regex>
+#include <sstream>
+#include <string>
+
+using namespace voxgig::structlib;
+
+// Render outcomes as JSON-ish so output matches the other ports.
+static std::string j_str(const std::string& s) {
+  std::string out = "\"";
+  for (char c : s) {
+    if (c == '"' || c == '\\')
+      out.push_back('\\'), out.push_back(c);
+    else
+      out.push_back(c);
+  }
+  out.push_back('"');
+  return out;
+}
+
+template <typename F> static void record(const char* label, F fn) {
+  auto t0 = std::chrono::steady_clock::now();
+  std::string outcome;
+  try {
+    outcome = std::string("OK | ") + fn();
+  } catch (const std::exception& e) {
+    outcome = std::string("ERR | ") + typeid(e).name() + ": " + e.what();
+  } catch (...) {
+    outcome = "ERR | unknown exception";
+  }
+  double ms =
+      std::chrono::duration<double, std::milli>(std::chrono::steady_clock::now() - t0).count();
+  std::printf("[regex-discovery] %s | %.2fms | %s\n", label, ms, outcome.c_str());
+}
+
+static std::string as_bool(bool b) {
+  return b ? "true" : "false";
+}
+
+static std::string as_vec(const std::vector<std::string>& v) {
+  std::string s = "[";
+  for (size_t i = 0; i < v.size(); i++) {
+    if (i)
+      s += ",";
+    s += j_str(v[i]);
+  }
+  s += "]";
+  return s;
+}
+
+static std::string as_vec2(const std::vector<std::vector<std::string>>& v) {
+  std::string s = "[";
+  for (size_t i = 0; i < v.size(); i++) {
+    if (i)
+      s += ",";
+    s += as_vec(v[i]);
+  }
+  s += "]";
+  return s;
+}
+
+int main() {
+  std::string a22(22, 'a');
+  std::string nest40 = std::string(40, '(') + "a" + std::string(40, ')');
+
+  record("P1_redos_nested_plus", [&] { return as_bool(re_test("^(a+)+$", a22 + "!")); });
+  record("P2_redos_alt_overlap", [&] { return as_bool(re_test("^(a|aa)+$", a22 + "!")); });
+  record("P3_empty_repeat_replace", [&] { return j_str(re_replace("a*", "abc", "X")); });
+  record("P4_unicode_replace_dot", [&] { return j_str(re_replace("\\.", "café.au.lait", "/")); });
+  record("P5_unicode_find_codepoint", [&] { return as_vec(re_find("é", "café au lait")); });
+  record("P6_deep_nesting_compile", [&] { return as_bool(re_test(nest40, "a")); });
+  record("P7_big_bounded_quantifier",
+         [&] { return as_bool(re_test("^a{0,10000}b$", std::string(10, 'a') + "b")); });
+  record("P8_invalid_pattern", [&] {
+    (void) re_compile("[abc");
+    return std::string("\"compiled\"");
+  });
+  record("P9_backref_re2_forbidden", [&] { return as_bool(re_test("^(a+)\\1$", "aaaa")); });
+  record("P10_find_all_zero_width", [&] { return as_vec2(re_find_all("a*", "bbb")); });
+
+  return 0;
+}
diff --git a/csharp/README.md b/csharp/README.md
index a22765b..68e114e 100644
--- a/csharp/README.md
+++ b/csharp/README.md
@@ -260,6 +260,42 @@ In progress.  Coverage of canonical functions is broad; check
 [`../REPORT.md`](../REPORT.md) for the latest status.
 
 
+## Regex
+
+Uniform six-function regex API (see `/REGEX_API.md`). The C# port
+wraps `System.Text.RegularExpressions.Regex`.
+
+### API
+
+| Function | Maps to |
+|---|---|
+| `ReCompile(pattern)`             | `new Regex(pattern)` (throws `RegexParseException` on bad pattern) |
+| `ReTest(pattern, input)`         | `Regex.IsMatch(input, pattern)` |
+| `ReFind(pattern, input)`         | first match as `string[]` of `[whole, group1, …]` or `null` |
+| `ReFindAll(pattern, input)`      | `List<string[]>` |
+| `ReReplace(pattern, input, rep)` | `Regex.Replace(input, pattern, rep)` |
+| `ReEscape(s)`                    | `Regex.Escape(s)` |
+
+### Dialect
+
+Patterns must stay inside the **RE2 subset** documented in `/REGEX.md`.
+.NET regex supports backreferences and lookaround; using them will not
+be portable.
+
+### Sharp edges
+
+- **Catastrophic backtracking.** .NET's regex is backtracking; the
+  discovery panel sees P1 (`^(a+)+$` over 22 a's plus `!`) in
+  ~390 ms here. .NET 7+ ships a non-backtracking engine you can opt
+  into via `RegexOptions.NonBacktracking` — consider it for
+  untrusted patterns. Stay inside the RE2 subset and prefer flat
+  patterns.
+- **Zero-width `replace`.** `ReReplace("a*", "abc", "X")` returns
+  `"XXbXcX"` — the ECMA convention shared by all PCRE/ECMA/.NET/Java/Onigmo engines plus the in-tree Thompson ports. Go (RE2) returns `"XbXcX"` instead; see `/REGEX_PATHOLOGICAL.md`.
+
+See `/REGEX_PATHOLOGICAL.md` for the cross-port pathological-input panel.
+
+
 ## Build and test
 
 ```bash
diff --git a/csharp/tests/RegexPathologicalTest.cs b/csharp/tests/RegexPathologicalTest.cs
new file mode 100644
index 0000000..122a632
--- /dev/null
+++ b/csharp/tests/RegexPathologicalTest.cs
@@ -0,0 +1,54 @@
+/* Copyright (c) 2025-2026 Voxgig Ltd. MIT LICENSE. */
+
+// RUN: cd csharp/tests && dotnet test --filter "DisplayName~RegexPathological"
+//
+// Discovery test: pathological regex inputs run against the port's Re* API.
+// Goal is to surface failures across ports, not to assert behaviour.
+// Panel is the same in every port (see REGEX.md).
+
+using System;
+using System.Diagnostics;
+using System.Text.Json;
+using static Voxgig.Struct.StructUtils;
+using Xunit;
+
+namespace Voxgig.Tests;
+
+public class RegexPathologicalTest
+{
+    private static void Record(string label, Func<object?> fn)
+    {
+        var sw = Stopwatch.StartNew();
+        string outcome;
+        try
+        {
+            var r = fn();
+            outcome = "OK | " + JsonSerializer.Serialize(r);
+        }
+        catch (Exception e)
+        {
+            outcome = "ERR | " + e.GetType().Name + ": " + e.Message;
+        }
+        sw.Stop();
+        var ms = sw.Elapsed.TotalMilliseconds;
+        Console.WriteLine($"[regex-discovery] {label} | {ms:F2}ms | {outcome}");
+    }
+
+    [Fact]
+    public void Panel()
+    {
+        var a22 = new string('a', 22);
+        var nest40 = new string('(', 40) + "a" + new string(')', 40);
+
+        Record("P1_redos_nested_plus",      () => ReTest("^(a+)+$", a22 + "!"));
+        Record("P2_redos_alt_overlap",      () => ReTest("^(a|aa)+$", a22 + "!"));
+        Record("P3_empty_repeat_replace",   () => ReReplace("a*", "abc", "X"));
+        Record("P4_unicode_replace_dot",    () => ReReplace("\\.", "café.au.lait", "/"));
+        Record("P5_unicode_find_codepoint", () => ReFind("é", "café au lait"));
+        Record("P6_deep_nesting_compile",   () => ReTest(nest40, "a"));
+        Record("P7_big_bounded_quantifier", () => ReTest("^a{0,10000}b$", new string('a', 10) + "b"));
+        Record("P8_invalid_pattern",        () => ReCompile("[abc"));
+        Record("P9_backref_re2_forbidden",  () => ReTest("^(a+)\\1$", "aaaa"));
+        Record("P10_find_all_zero_width",   () => ReFindAll("a*", "bbb"));
+    }
+}
diff --git a/cspell.json b/cspell.json
index c9e400b..f699272 100644
--- a/cspell.json
+++ b/cspell.json
@@ -55,6 +55,8 @@
     "perigolo",
     "Rodger",
     "Lovelace",
+    "Dfile",
+    "mojibake",
     "getpath",
     "setpath",
     "getprop",
diff --git a/go/README.md b/go/README.md
index bf6cea2..86989fb 100644
--- a/go/README.md
+++ b/go/README.md
@@ -401,6 +401,54 @@ canonical "lists are reference-stable" assumption.
 92/92 tests pass against the shared corpus.
 
 
+## Regex
+
+Uniform six-function regex API (see `/REGEX_API.md`). The Go port
+wraps the stdlib `regexp` package — Go's `regexp` *is* the RE2
+reference implementation.
+
+### API
+
+| Function | Maps to |
+|---|---|
+| `ReCompile(pattern)`              | `regexp.MustCompile(pattern)` (panics on bad pattern) |
+| `ReTest(pattern, input)`          | `re.MatchString(input)` |
+| `ReFind(pattern, input)`          | `re.FindStringSubmatch(input)` |
+| `ReFindAll(pattern, input)`       | `re.FindAllStringSubmatch(input, -1)` |
+| `ReReplace(pattern, input, rep)`  | `re.ReplaceAllString(input, rep)` |
+| `ReReplaceFunc(pattern, input,f)` | `re.ReplaceAllStringFunc(input, f)` |
+| `ReEscape(s)`                     | alias for `EscRe(s)` |
+
+### Dialect
+
+Patterns must stay inside the **RE2 subset** documented in `/REGEX.md`.
+Since Go's regexp engine *is* RE2, this is the natural ceiling: there is
+no PCRE escape hatch.
+
+### Sharp edges (Go-specific)
+
+- **`ReCompile` panics.** It's a pass-through to `regexp.MustCompile`,
+  so an invalid pattern aborts via `panic`. This matches the
+  throw/raise behaviour of every other port; wrap in `recover()` if
+  you accept user-supplied patterns.
+- **Bounded quantifier cap.** RE2 refuses `{n,m}` with `m > 1000`.
+  `^a{0,10000}b$` *panics* at compile time with "invalid repeat
+  count". This is a hard RE2 limit — no portable workaround. The
+  canonical patterns and `$LIKE` operator stay well below it.
+- **No backreferences or lookaround.** RE2 does not support them by
+  design. `^(a+)\1$` panics on compile. The cross-port dialect already
+  forbids them; this is the engine that enforces the rule hardest.
+- **Zero-width `re_replace` uses RE2's convention.**
+  `re_replace("a*", "abc", "X")` returns `"XbXcX"` — RE2 suppresses
+  an empty match immediately after a non-empty match at the same
+  offset. PCRE / ECMA / .NET / Java / the in-tree Thompson ports all
+  return `"XXbXcX"` instead. This is inherent to Go's host regex
+  package and is **not** wrapped: portable callers should not depend
+  on cross-port identity of zero-width replacement output.
+
+See `/REGEX_PATHOLOGICAL.md` for the cross-port pathological-input panel.
+
+
 ## Build and test
 
 ```bash
diff --git a/go/regex_pathological_test.go b/go/regex_pathological_test.go
new file mode 100644
index 0000000..ef673c0
--- /dev/null
+++ b/go/regex_pathological_test.go
@@ -0,0 +1,54 @@
+// RUN: go test -run=TestRegexPathological -v
+//
+// Discovery test: pathological regex inputs run against the port's Re* API.
+// Goal is to surface failures across ports, not to assert behaviour.
+// Panel is the same in every port (see REGEX.md).
+
+package voxgigstruct_test
+
+import (
+	"encoding/json"
+	"fmt"
+	"strings"
+	"testing"
+	"time"
+
+	voxgigstruct "github.com/voxgig/struct/go"
+)
+
+func record(label string, fn func() any) {
+	t0 := time.Now()
+	var outcome string
+	func() {
+		defer func() {
+			if r := recover(); r != nil {
+				outcome = fmt.Sprintf("ERR | panic: %v", r)
+			}
+		}()
+		r := fn()
+		b, err := json.Marshal(r)
+		if err != nil {
+			outcome = fmt.Sprintf("OK | <unjsonable %T>: %v", r, r)
+			return
+		}
+		outcome = fmt.Sprintf("OK | %s", string(b))
+	}()
+	ms := float64(time.Since(t0).Microseconds()) / 1000.0
+	fmt.Printf("[regex-discovery] %s | %.2fms | %s\n", label, ms, outcome)
+}
+
+func TestRegexPathological(t *testing.T) {
+	a22 := strings.Repeat("a", 22)
+	nest40 := strings.Repeat("(", 40) + "a" + strings.Repeat(")", 40)
+
+	record("P1_redos_nested_plus", func() any { return voxgigstruct.ReTest("^(a+)+$", a22+"!") })
+	record("P2_redos_alt_overlap", func() any { return voxgigstruct.ReTest("^(a|aa)+$", a22+"!") })
+	record("P3_empty_repeat_replace", func() any { return voxgigstruct.ReReplace("a*", "abc", "X") })
+	record("P4_unicode_replace_dot", func() any { return voxgigstruct.ReReplace(`\.`, "café.au.lait", "/") })
+	record("P5_unicode_find_codepoint", func() any { return voxgigstruct.ReFind("é", "café au lait") })
+	record("P6_deep_nesting_compile", func() any { return voxgigstruct.ReTest(nest40, "a") })
+	record("P7_big_bounded_quantifier", func() any { return voxgigstruct.ReTest("^a{0,10000}b$", strings.Repeat("a", 10)+"b") })
+	record("P8_invalid_pattern", func() any { return voxgigstruct.ReCompile("[abc") != nil })
+	record("P9_backref_re2_forbidden", func() any { return voxgigstruct.ReTest(`^(a+)\1$`, "aaaa") })
+	record("P10_find_all_zero_width", func() any { return voxgigstruct.ReFindAll("a*", "bbb") })
+}
diff --git a/go/voxgigstruct.go b/go/voxgigstruct.go
index 7c64397..790841d 100644
--- a/go/voxgigstruct.go
+++ b/go/voxgigstruct.go
@@ -993,6 +993,13 @@ func ReFindAll(pattern, input string) [][]string {
 
 // ReReplace replaces every match. The replacement supports Go's $0..$N
 // reference syntax (functionally equivalent to JS $&..$N).
+//
+// Note: Go's `regexp` (RE2) suppresses an empty match immediately
+// following a non-empty match at the same offset. This is RE2's
+// chosen convention and differs from ECMAScript / Python / Java etc:
+// `re_replace("a*", "abc", "X")` returns "XbXcX" here, "XXbXcX" on
+// PCRE/ECMA engines. The variance is inherent to the host regex
+// package; see REGEX_PATHOLOGICAL.md.
 func ReReplace(pattern, input, replacement string) string {
 	return regexp.MustCompile(pattern).ReplaceAllString(input, replacement)
 }
diff --git a/java/README.md b/java/README.md
index 71714ec..5af13d3 100644
--- a/java/README.md
+++ b/java/README.md
@@ -247,6 +247,46 @@ No standard test runner configured yet.  `StructTest.java` exists
 but is minimal.
 
 
+## Regex
+
+Uniform six-function regex API (see `/REGEX_API.md`). The Java port
+wraps `java.util.regex.Pattern`.
+
+### API
+
+| Function | Maps to |
+|---|---|
+| `reCompile(pattern)`              | `Pattern.compile(pattern)` (throws `PatternSyntaxException` on bad pattern) |
+| `reTest(pattern, input)`          | `Pattern.compile(pattern).matcher(input).find()` |
+| `reFind(pattern, input)`          | first match as `String[]` of `[whole, group1, …]` or `null` |
+| `reFindAll(pattern, input)`       | `List<String[]>` |
+| `reReplace(pattern, input, repl)` | `matcher.replaceAll(repl)` |
+| `reEscape(s)`                     | escape regex metacharacters |
+
+### Dialect
+
+Patterns must stay inside the **RE2 subset** documented in `/REGEX.md`.
+Java's regex supports backreferences and lookaround; using them will
+not be portable.
+
+### Sharp edges
+
+- **Catastrophic backtracking.** `java.util.regex` is backtracking;
+  the discovery panel sees P1 (`^(a+)+$` over 22 a's plus `!`) in
+  ~13 ms here. Other shapes can be worse. Prefer flat patterns.
+- **Zero-width `replace`.** `reReplace("a*", "abc", "X")` returns
+  `"XXbXcX"` — the ECMA convention shared by all PCRE/ECMA/.NET/Java/Onigmo engines plus the in-tree Thompson ports. Go (RE2) returns `"XbXcX"` instead; see `/REGEX_PATHOLOGICAL.md`.
+- **`System.out` encoding.** When printing match results that contain
+  non-ASCII characters, `System.out`'s default `PrintStream` uses the
+  platform's default charset, not UTF-8. The discovery panel sees
+  `caf?` in stdout though the in-memory `String` is correct UTF-16.
+  Pass `-Dfile.encoding=UTF-8` (or use `PrintStream(System.out, true,
+  StandardCharsets.UTF_8)`) when this matters. Orthogonal to the
+  regex itself.
+
+See `/REGEX_PATHOLOGICAL.md` for the cross-port pathological-input panel.
+
+
 ## Build and test
 
 ```bash
diff --git a/java/src/test/RegexPathologicalTest.java b/java/src/test/RegexPathologicalTest.java
new file mode 100644
index 0000000..b1b6870
--- /dev/null
+++ b/java/src/test/RegexPathologicalTest.java
@@ -0,0 +1,46 @@
+// RUN: mvn -Dtest=RegexPathologicalTest test
+//
+// Discovery test: pathological regex inputs run against the port's re* API.
+// Goal is to surface failures across ports, not to assert behaviour.
+// Panel is the same in every port (see REGEX.md).
+
+package voxgig.struct;
+
+import com.google.gson.Gson;
+import org.junit.jupiter.api.Test;
+
+import java.util.function.Supplier;
+
+class RegexPathologicalTest {
+  private static final Gson GSON = new Gson();
+
+  private static void record(String label, Supplier<Object> fn) {
+    long t0 = System.nanoTime();
+    String outcome;
+    try {
+      Object r = fn.get();
+      outcome = "OK | " + GSON.toJson(r);
+    } catch (Throwable e) {
+      outcome = "ERR | " + e.getClass().getSimpleName() + ": " + e.getMessage();
+    }
+    double ms = (System.nanoTime() - t0) / 1e6;
+    System.out.printf("[regex-discovery] %s | %.2fms | %s%n", label, ms, outcome);
+  }
+
+  @Test
+  void panel() {
+    String a22 = "a".repeat(22);
+    String nest40 = "(".repeat(40) + "a" + ")".repeat(40);
+
+    record("P1_redos_nested_plus",      () -> Struct.reTest("^(a+)+$", a22 + "!"));
+    record("P2_redos_alt_overlap",      () -> Struct.reTest("^(a|aa)+$", a22 + "!"));
+    record("P3_empty_repeat_replace",   () -> Struct.reReplace("a*", "abc", "X"));
+    record("P4_unicode_replace_dot",    () -> Struct.reReplace("\\.", "café.au.lait", "/"));
+    record("P5_unicode_find_codepoint", () -> Struct.reFind("é", "café au lait"));
+    record("P6_deep_nesting_compile",   () -> Struct.reTest(nest40, "a"));
+    record("P7_big_bounded_quantifier", () -> Struct.reTest("^a{0,10000}b$", "a".repeat(10) + "b"));
+    record("P8_invalid_pattern",        () -> Struct.reCompile("[abc"));
+    record("P9_backref_re2_forbidden",  () -> Struct.reTest("^(a+)\\1$", "aaaa"));
+    record("P10_find_all_zero_width",   () -> Struct.reFindAll("a*", "bbb"));
+  }
+}
diff --git a/javascript/README.md b/javascript/README.md
index 36a41b9..c5c858d 100644
--- a/javascript/README.md
+++ b/javascript/README.md
@@ -367,6 +367,42 @@ Otherwise functionally identical -- both run on V8.
 84/84 tests pass against the shared corpus.
 
 
+## Regex
+
+Uniform six-function regex API (see `/REGEX_API.md`). On JavaScript
+this is the ECMAScript `RegExp` built-in.
+
+### API
+
+| Function | Maps to |
+|---|---|
+| `re_compile(pattern, flags?)`     | `new RegExp(pattern, flags ?? 'g')` |
+| `re_test(pattern, input)`         | `pattern.test(input)` |
+| `re_find(pattern, input)`         | `input.match(pattern)` (non-global pattern) |
+| `re_find_all(pattern, input)`     | `[...input.matchAll(pattern)]` |
+| `re_replace(pattern, input, rep)` | `input.replace(pattern, rep)` (global pattern) |
+| `re_escape(s)`                    | escape `[.*+?^${}()|[\]\\]` in `s` |
+
+### Dialect
+
+Patterns must stay inside the **RE2 subset** documented in `/REGEX.md`.
+`RegExp` itself supports backreferences and lookaround, but other ports
+do not, so using those will not be portable.
+
+### Sharp edges
+
+- **Catastrophic backtracking.** `RegExp` is a backtracking engine;
+  nested quantifiers like `(a+)+` against a non-matching suffix can be
+  exponential in input length (the discovery panel sees ~180 ms on
+  Node 22 vs <0.1 ms on RE2-style engines). Prefer flat patterns and
+  character classes over alternations.
+- **Zero-width `replace`.** `re_replace("a*", "abc", "X")` returns
+  `"XXbXcX"` — the ECMA convention shared by all PCRE/ECMA/.NET/Java/Onigmo engines plus the in-tree Thompson ports. Go (RE2) returns `"XbXcX"` instead; see `/REGEX_PATHOLOGICAL.md`.
+
+See `/REGEX_PATHOLOGICAL.md` for the cross-port pathological-input
+panel.
+
+
 ## Build and test
 
 ```bash
diff --git a/javascript/test/regex_pathological.test.js b/javascript/test/regex_pathological.test.js
new file mode 100644
index 0000000..391e7a1
--- /dev/null
+++ b/javascript/test/regex_pathological.test.js
@@ -0,0 +1,45 @@
+// VERSION: @voxgig/struct 0.1.0
+//
+// Discovery test: pathological regex inputs run against the port's re_* API.
+// The goal is to surface which inputs cause errors, hangs, or surprising
+// output across ports — NOT to assert any specific behaviour. Each case
+// wraps the call in try/catch so one failure does not mask the others.
+// The panel is the same in every port (see REGEX.md).
+
+const { test } = require('node:test')
+const struct = require('../src/struct')
+
+const { re_compile, re_test, re_find, re_find_all, re_replace } = struct
+
+function rep(s, n) {
+  return new Array(n + 1).join(s)
+}
+
+function record(label, fn) {
+  const t0 = process.hrtime.bigint()
+  let outcome
+  try {
+    const r = fn()
+    outcome = `OK | ${JSON.stringify(r)}`
+  } catch (e) {
+    outcome = `ERR | ${e && e.message ? e.message : String(e)}`
+  }
+  const ms = Number(process.hrtime.bigint() - t0) / 1e6
+  console.log(`[regex-discovery] ${label} | ${ms.toFixed(2)}ms | ${outcome}`)
+}
+
+test('regex pathological discovery', () => {
+  const A22 = rep('a', 22)
+  const NEST40 = rep('(', 40) + 'a' + rep(')', 40)
+
+  record('P1_redos_nested_plus', () => re_test('^(a+)+$', A22 + '!'))
+  record('P2_redos_alt_overlap', () => re_test('^(a|aa)+$', A22 + '!'))
+  record('P3_empty_repeat_replace', () => re_replace('a*', 'abc', 'X'))
+  record('P4_unicode_replace_dot', () => re_replace('\\.', 'café.au.lait', '/'))
+  record('P5_unicode_find_codepoint', () => re_find('é', 'café au lait'))
+  record('P6_deep_nesting_compile', () => re_test(NEST40, 'a'))
+  record('P7_big_bounded_quantifier', () => re_test('^a{0,10000}b$', rep('a', 10) + 'b'))
+  record('P8_invalid_pattern', () => re_compile('[abc'))
+  record('P9_backref_re2_forbidden', () => re_test('^(a+)\\1$', 'aaaa'))
+  record('P10_find_all_zero_width', () => re_find_all('a*', 'bbb'))
+})
diff --git a/kotlin/src/test/kotlin/voxgig/struct/RegexPathologicalTest.kt b/kotlin/src/test/kotlin/voxgig/struct/RegexPathologicalTest.kt
new file mode 100644
index 0000000..f7463db
--- /dev/null
+++ b/kotlin/src/test/kotlin/voxgig/struct/RegexPathologicalTest.kt
@@ -0,0 +1,45 @@
+// Discovery test: pathological regex inputs run against the port's re* API.
+// Goal is to surface failures across ports, not to assert behaviour.
+// Panel is the same in every port (see REGEX.md).
+
+package voxgig.struct
+
+import com.google.gson.Gson
+import kotlin.test.Test
+
+class RegexPathologicalTest {
+    private val gson = Gson()
+
+    private fun record(
+        label: String,
+        fn: () -> Any?,
+    ) {
+        val t0 = System.nanoTime()
+        val outcome: String =
+            try {
+                val r = fn()
+                "OK | " + gson.toJson(r)
+            } catch (e: Throwable) {
+                "ERR | ${e::class.simpleName}: ${e.message}"
+            }
+        val ms = (System.nanoTime() - t0) / 1e6
+        println("[regex-discovery] %s | %.2fms | %s".format(label, ms, outcome))
+    }
+
+    @Test
+    fun panel() {
+        val a22 = "a".repeat(22)
+        val nest40 = "(".repeat(40) + "a" + ")".repeat(40)
+
+        record("P1_redos_nested_plus") { Struct.reTest("^(a+)+\$", a22 + "!") }
+        record("P2_redos_alt_overlap") { Struct.reTest("^(a|aa)+\$", a22 + "!") }
+        record("P3_empty_repeat_replace") { Struct.reReplace("a*", "abc", "X") }
+        record("P4_unicode_replace_dot") { Struct.reReplace("\\.", "café.au.lait", "/") }
+        record("P5_unicode_find_codepoint") { Struct.reFind("é", "café au lait") }
+        record("P6_deep_nesting_compile") { Struct.reTest(nest40, "a") }
+        record("P7_big_bounded_quantifier") { Struct.reTest("^a{0,10000}b\$", "a".repeat(10) + "b") }
+        record("P8_invalid_pattern") { Struct.reCompile("[abc") }
+        record("P9_backref_re2_forbidden") { Struct.reTest("^(a+)\\1\$", "aaaa") }
+        record("P10_find_all_zero_width") { Struct.reFindAll("a*", "bbb") }
+    }
+}
diff --git a/lua/README.md b/lua/README.md
index 324eca9..c1243b3 100644
--- a/lua/README.md
+++ b/lua/README.md
@@ -370,6 +370,54 @@ reference-stable" assumption holds without a wrapper.
 75/75 tests pass against the shared corpus.
 
 
+## Regex
+
+Uniform six-function regex API (see `/REGEX_API.md`). The Lua port
+**ships its own RE2-subset engine** in `src/regex.lua` (~500 LOC of
+pure Lua — Lua's built-in pattern language is intentionally not
+regex, so we vendor one). No LuaRocks dependency, no FFI.
+
+### API
+
+| Function | Returns |
+|---|---|
+| `re.re_compile(pattern)`              | compiled regex object |
+| `re.re_test(pattern, input)`          | `true` / `false` |
+| `re.re_find(pattern, input)`          | `{whole, group1, …}` or `nil` |
+| `re.re_find_all(pattern, input)`      | `{ {whole, group1, …}, … }` |
+| `re.re_replace(pattern, input, repl)` | `string` |
+| `re.re_escape(literal)`               | `string` |
+
+### Dialect
+
+The in-tree engine implements the RE2 subset documented in
+`/REGEX.md`: literals + escapes, `.`, `^`/`$`, `* + ? {n} {n,} {n,m}`
+(greedy + lazy), classes incl. `\d \w \s` and friends, `\b`/`\B`,
+`(...)` / `(?:...)`, alternation.
+
+**Not supported** (by design — RE2 doesn't either): backreferences,
+lookaround, possessive quantifiers, atomic groups. Backref patterns
+compile (the parser treats `\1` as a literal `1`) but never match
+back-reference semantics, so `re.re_test("^(a+)\\1$", "aaaa")` returns
+`false`. Don't rely on this — write portable patterns.
+
+### Sharp edges (Lua-specific)
+
+- **It's a Lua VM regex engine.** P7 (`a{0,10000}b$`) takes ~80 ms
+  here — fine functionally, slow versus native engines. The library's
+  hot paths don't use bounded quantifiers anywhere near that size.
+- **No catastrophic backtracking.** Thompson-NFA construction; P1/P2
+  finish in microseconds.
+- **Zero-width `re_replace`.** `re.re_replace("a*", "abc", "X")`
+  returns `"XXbXcX"` — the convention shared with PCRE/ECMA/Java/.NET
+  and the other in-tree Thompson ports (Rust / C / Zig). Go (RE2)
+  returns `"XbXcX"` instead. (Pre-fix the Lua engine produced
+  `"XaXbXcX"`; the `OP_MATCH` handler in `regex.lua` is now
+  priority-correct, matching the C port's fix.)
+
+See `/REGEX_PATHOLOGICAL.md` for the cross-port pathological-input panel.
+
+
 ## Build and test
 
 ```bash
diff --git a/lua/src/regex.lua b/lua/src/regex.lua
index 2718ff7..65e5c18 100644
--- a/lua/src/regex.lua
+++ b/lua/src/regex.lua
@@ -483,7 +483,11 @@ local function match_at(re, input, ilen, start)
       elseif op == OP_CLASS then
         if c >= 0 and insn.data.cc[c] then add_thread(re, nxt, th.pc + 1, th.slots, sp + 1, input, ilen, visited) end
       elseif op == OP_MATCH then
-        if not found then found = th.slots end
+        -- Always overwrite: priority ordering means later MATCHes from
+        -- surviving (higher-priority) descendants in nxt should override
+        -- earlier matches from lower-priority threads. `if not found` made
+        -- greedy quantifiers behave lazily (e.g. `a*` on "abc" matched "").
+        found = th.slots
         break
       end
     end
@@ -491,10 +495,10 @@ local function match_at(re, input, ilen, start)
     sp = sp + 1
     if #cur == 0 then break end
   end
-  -- Drain remaining current threads for trailing MATCH.
+  -- Drain remaining current threads for trailing MATCH (mirrors C engine).
   for i = 1, #cur do
     if re.code[cur[i].pc].op == OP_MATCH then
-      if not found then found = cur[i].slots end
+      found = cur[i].slots
       break
     end
   end
diff --git a/lua/test/regex_pathological.lua b/lua/test/regex_pathological.lua
new file mode 100644
index 0000000..5207485
--- /dev/null
+++ b/lua/test/regex_pathological.lua
@@ -0,0 +1,62 @@
+-- Discovery test: pathological regex inputs run against the port's re_* API.
+-- Goal is to surface failures across ports, not to assert behaviour.
+-- Panel is the same in every port (see REGEX.md).
+--
+-- RUN: lua test/regex_pathological.lua
+
+package.path = "../src/?.lua;./src/?.lua;" .. (package.path or "")
+local re = require("regex")
+
+local function json_str(s)
+  return '"' .. tostring(s):gsub('"', '\\"') .. '"'
+end
+
+local function json_table(t)
+  local parts = {}
+  for _, v in ipairs(t) do
+    if type(v) == "table" then
+      parts[#parts + 1] = json_table(v)
+    elseif type(v) == "string" then
+      parts[#parts + 1] = json_str(v)
+    else
+      parts[#parts + 1] = tostring(v)
+    end
+  end
+  return "[" .. table.concat(parts, ",") .. "]"
+end
+
+local function render(r)
+  local t = type(r)
+  if t == "nil" then return "null"
+  elseif t == "boolean" then return tostring(r)
+  elseif t == "string" then return json_str(r)
+  elseif t == "table" then return json_table(r)
+  else return tostring(r) end
+end
+
+local function record(label, fn)
+  local t0 = os.clock()
+  local ok, r = pcall(fn)
+  local ms = (os.clock() - t0) * 1000.0
+  local outcome
+  if ok then
+    outcome = "OK | " .. render(r)
+  else
+    outcome = "ERR | " .. tostring(r)
+  end
+  io.write(string.format("[regex-discovery] %s | %.2fms | %s\n", label, ms, outcome))
+end
+
+local a22 = string.rep("a", 22)
+local nest40 = string.rep("(", 40) .. "a" .. string.rep(")", 40)
+
+record("P1_redos_nested_plus",      function() return re.re_test("^(a+)+$", a22 .. "!") end)
+record("P2_redos_alt_overlap",      function() return re.re_test("^(a|aa)+$", a22 .. "!") end)
+record("P3_empty_repeat_replace",   function() return re.re_replace("a*", "abc", "X") end)
+record("P4_unicode_replace_dot",    function() return re.re_replace("\\.", "café.au.lait", "/") end)
+record("P5_unicode_find_codepoint", function() return re.re_find("é", "café au lait") end)
+record("P6_deep_nesting_compile",   function() return re.re_test(nest40, "a") end)
+record("P7_big_bounded_quantifier", function() return re.re_test("^a{0,10000}b$", string.rep("a", 10) .. "b") end)
+record("P8_invalid_pattern",        function() return re.re_compile("[abc") end)
+record("P9_backref_re2_forbidden",  function() return re.re_test("^(a+)\\1$", "aaaa") end)
+record("P10_find_all_zero_width",   function() return re.re_find_all("a*", "bbb") end)
diff --git a/perl/README.md b/perl/README.md
index 17600a2..aca2e94 100644
--- a/perl/README.md
+++ b/perl/README.md
@@ -104,6 +104,44 @@ because they don't preserve insertion order.
 - Builder helpers: `jm` (insertion-ordered map literal), `jt`
   (list literal).
 
+## Regex
+
+Uniform six-function regex API (see `/REGEX_API.md`). The Perl port
+wraps Perl's built-in regex engine.
+
+### API
+
+| Function | Maps to |
+|---|---|
+| `re_compile(pattern, flags?)`         | `qr/$pattern/` |
+| `re_test(pattern, input)`             | `$input =~ $re` |
+| `re_find(pattern, input)`             | first match as `[whole, $1, ...]` or `undef` |
+| `re_find_all(pattern, input)`         | all matches, one arrayref per match |
+| `re_replace(pattern, input, repl)`    | `s/$re/$repl/g` (callable or template) |
+| `re_escape(s)`                        | `quotemeta` equivalent |
+
+### Dialect
+
+Patterns must stay inside the **RE2 subset** documented in `/REGEX.md`.
+Perl's regex supports backreferences, lookaround, recursion — none of
+which are portable to the Go / Rust / C / Lua / Zig ports.
+
+### Sharp edges
+
+- **Catastrophic backtracking.** Perl's regex engine is backtracking
+  but ships with optimisations (trie engine for alternation, etc.).
+  The discovery panel runs P1/P2 in microseconds here, but other
+  pathological shapes can still blow up. Stay flat.
+- **Zero-width `replace`.** `re_replace("a*", "abc", "X")` returns
+  `"XXbXcX"` — the ECMA convention shared by all PCRE/ECMA/.NET/Java/Onigmo engines plus the in-tree Thompson ports. Go (RE2) returns `"XbXcX"` instead; see `/REGEX_PATHOLOGICAL.md`.
+- **UTF-8 handling.** Pass character strings (use `use utf8;` for
+  literals, or `decode_utf8` for bytes). Encoding round-trip bugs in
+  caller code can manifest as `cafÃ©` style mojibake at print time —
+  the regex itself preserves character semantics.
+
+See `/REGEX_PATHOLOGICAL.md` for the cross-port pathological-input panel.
+
+
 ## Tests
 
 ```bash
diff --git a/perl/t/regex_pathological.t b/perl/t/regex_pathological.t
new file mode 100644
index 0000000..904af82
--- /dev/null
+++ b/perl/t/regex_pathological.t
@@ -0,0 +1,55 @@
+#!perl
+# Discovery test: pathological regex inputs run against the port's re_* API.
+# Goal is to surface failures across ports, not to assert behaviour.
+# Panel is the same in every port (see REGEX.md).
+
+use 5.018;
+use strict;
+use warnings;
+use utf8;
+use Test::More;
+use FindBin;
+use lib "$FindBin::Bin/../lib";
+use Voxgig::Struct qw();
+use JSON::PP qw();
+use Time::HiRes qw(gettimeofday tv_interval);
+
+binmode STDOUT, ':encoding(UTF-8)';
+
+# JSON::PP defaults to UTF-8-encoding its output bytes. We want characters
+# so STDOUT's :utf8 layer can encode them once (not twice).
+my $JSON = JSON::PP->new->utf8(0);
+
+sub record {
+    my ($label, $fn) = @_;
+    my $t0 = [gettimeofday];
+    my $outcome;
+    my $r = eval { $fn->() };
+    if (my $err = $@) {
+        chomp $err;
+        $outcome = "ERR | $err";
+    } else {
+        my $enc = eval { $JSON->encode($r) };
+        $enc = (defined $r ? "$r" : 'null') if $@;
+        $outcome = "OK | $enc";
+    }
+    my $ms = tv_interval($t0) * 1000.0;
+    printf("[regex-discovery] %s | %.2fms | %s\n", $label, $ms, $outcome);
+}
+
+my $a22    = 'a' x 22;
+my $nest40 = ('(' x 40) . 'a' . (')' x 40);
+
+record('P1_redos_nested_plus',      sub { Voxgig::Struct::re_test('^(a+)+$', $a22 . '!') });
+record('P2_redos_alt_overlap',      sub { Voxgig::Struct::re_test('^(a|aa)+$', $a22 . '!') });
+record('P3_empty_repeat_replace',   sub { Voxgig::Struct::re_replace('a*', 'abc', 'X') });
+record('P4_unicode_replace_dot',    sub { Voxgig::Struct::re_replace('\\.', 'café.au.lait', '/') });
+record('P5_unicode_find_codepoint', sub { Voxgig::Struct::re_find('é', 'café au lait') });
+record('P6_deep_nesting_compile',   sub { Voxgig::Struct::re_test($nest40, 'a') });
+record('P7_big_bounded_quantifier', sub { Voxgig::Struct::re_test('^a{0,10000}b$', ('a' x 10) . 'b') });
+record('P8_invalid_pattern',        sub { Voxgig::Struct::re_compile('[abc') });
+record('P9_backref_re2_forbidden',  sub { Voxgig::Struct::re_test('^(a+)\\1$', 'aaaa') });
+record('P10_find_all_zero_width',   sub { Voxgig::Struct::re_find_all('a*', 'bbb') });
+
+pass('regex pathological discovery ran');
+done_testing();
diff --git a/php/README.md b/php/README.md
index f709bfc..0874f03 100644
--- a/php/README.md
+++ b/php/README.md
@@ -362,6 +362,46 @@ PHP method names match canonical lowercase: `getpath`, `setpath`,
 82/82 tests pass, 920 assertions.
 
 
+## Regex
+
+Uniform six-function regex API (see `/REGEX_API.md`). The PHP port
+wraps PCRE (`preg_*`).
+
+### API
+
+| Function | Maps to |
+|---|---|
+| `re_compile(pattern)`              | delimited PCRE pattern (validated via `preg_match`) |
+| `re_test(pattern, input)`          | `preg_match` → bool |
+| `re_find(pattern, input)`          | `preg_match` with captures, returns `[whole, group1, ...]` or `null` |
+| `re_find_all(pattern, input)`      | `preg_match_all(..., PREG_SET_ORDER)` |
+| `re_replace(pattern, input, repl)` | `preg_replace` (or `preg_replace_callback` for callable repl) |
+| `re_escape(s)`                     | `preg_quote(s)` equivalent |
+
+### Dialect
+
+Patterns must stay inside the **RE2 subset** documented in `/REGEX.md`.
+PCRE supports backreferences and lookaround; using them will not be
+portable.
+
+### Sharp edges
+
+- **`re_compile` validates eagerly.** Invalid patterns throw
+  `InvalidArgumentException` at compile time. This is a recent fix:
+  the wrapper used to swallow PCRE warnings via `@preg_match` and
+  return `false` silently from `re_test`/`re_find`. Callers can now
+  distinguish "no match" from "bad pattern".
+- **Catastrophic backtracking.** PCRE is a backtracking engine but has
+  a JIT and a backtrack limit; the discovery panel runs P1/P2 in a few
+  ms here. Larger inputs or pathological shapes can hit
+  `pcre.backtrack_limit` and return `false`. Stay inside the RE2 subset
+  and prefer flat patterns.
+- **Zero-width `replace`.** `re_replace("a*", "abc", "X")` returns
+  `"XXbXcX"` — the ECMA convention shared by all PCRE/ECMA/.NET/Java/Onigmo engines plus the in-tree Thompson ports. Go (RE2) returns `"XbXcX"` instead; see `/REGEX_PATHOLOGICAL.md`.
+
+See `/REGEX_PATHOLOGICAL.md` for the cross-port pathological-input panel.
+
+
 ## Build and test
 
 ```bash
diff --git a/php/src/Struct.php b/php/src/Struct.php
index d286578..cafe059 100644
--- a/php/src/Struct.php
+++ b/php/src/Struct.php
@@ -565,20 +565,25 @@ public static function escre(?string $s): string
     public static function re_compile(string $pattern): string
     {
         // PHP wants a delimited pattern; return one delimited with '/'.
-        if (strlen($pattern) > 0 && $pattern[0] === '/') {
-            return $pattern;
+        $delimited = strlen($pattern) > 0 && $pattern[0] === '/'
+            ? $pattern
+            : '/' . str_replace('/', '\\/', $pattern) . '/';
+        // PCRE returns false from preg_match on invalid patterns; surface that
+        // to the caller (matching the throw behaviour of JS/Python/Java/.NET).
+        if (@preg_match($delimited, '') === false) {
+            throw new \InvalidArgumentException("Invalid regex pattern: $pattern");
         }
-        return '/' . str_replace('/', '\\/', $pattern) . '/';
+        return $delimited;
     }
 
     public static function re_test(string $pattern, string $input): bool
     {
-        return @preg_match(self::re_compile($pattern), $input) === 1;
+        return preg_match(self::re_compile($pattern), $input) === 1;
     }
 
     public static function re_find(string $pattern, string $input): ?array
     {
-        if (@preg_match(self::re_compile($pattern), $input, $m) === 1) {
+        if (preg_match(self::re_compile($pattern), $input, $m) === 1) {
             return $m;
         }
         return null;
@@ -587,7 +592,7 @@ public static function re_find(string $pattern, string $input): ?array
     public static function re_find_all(string $pattern, string $input): array
     {
         $out = [];
-        if (@preg_match_all(self::re_compile($pattern), $input, $m, PREG_SET_ORDER) !== false) {
+        if (preg_match_all(self::re_compile($pattern), $input, $m, PREG_SET_ORDER) !== false) {
             $out = $m;
         }
         return $out;
diff --git a/php/tests/RegexPathologicalTest.php b/php/tests/RegexPathologicalTest.php
new file mode 100644
index 0000000..54be3b3
--- /dev/null
+++ b/php/tests/RegexPathologicalTest.php
@@ -0,0 +1,45 @@
+<?php
+
+// Discovery test: pathological regex inputs run against the port's re_* API.
+// Goal is to surface failures across ports, not to assert behaviour.
+// Panel is the same in every port (see REGEX.md).
+
+require_once __DIR__ . '/../src/Struct.php';
+
+use PHPUnit\Framework\TestCase;
+use Voxgig\Struct\Struct;
+
+class RegexPathologicalTest extends TestCase
+{
+    private static function record(string $label, callable $fn): void
+    {
+        $t0 = hrtime(true);
+        try {
+            $r = $fn();
+            $outcome = 'OK | ' . json_encode($r, JSON_UNESCAPED_UNICODE);
+        } catch (\Throwable $e) {
+            $outcome = 'ERR | ' . get_class($e) . ': ' . $e->getMessage();
+        }
+        $ms = (hrtime(true) - $t0) / 1e6;
+        printf("[regex-discovery] %s | %.2fms | %s\n", $label, $ms, $outcome);
+    }
+
+    public function testPanel(): void
+    {
+        $a22    = str_repeat('a', 22);
+        $nest40 = str_repeat('(', 40) . 'a' . str_repeat(')', 40);
+
+        self::record('P1_redos_nested_plus', fn() => Struct::re_test('^(a+)+$', $a22 . '!'));
+        self::record('P2_redos_alt_overlap', fn() => Struct::re_test('^(a|aa)+$', $a22 . '!'));
+        self::record('P3_empty_repeat_replace', fn() => Struct::re_replace('a*', 'abc', 'X'));
+        self::record('P4_unicode_replace_dot', fn() => Struct::re_replace('\\.', 'café.au.lait', '/'));
+        self::record('P5_unicode_find_codepoint', fn() => Struct::re_find('é', 'café au lait'));
+        self::record('P6_deep_nesting_compile', fn() => Struct::re_test($nest40, 'a'));
+        self::record('P7_big_bounded_quantifier', fn() => Struct::re_test('^a{0,10000}b$', str_repeat('a', 10) . 'b'));
+        self::record('P8_invalid_pattern', fn() => Struct::re_compile('[abc'));
+        self::record('P9_backref_re2_forbidden', fn() => Struct::re_test('^(a+)\\1$', 'aaaa'));
+        self::record('P10_find_all_zero_width', fn() => Struct::re_find_all('a*', 'bbb'));
+
+        $this->assertTrue(true);
+    }
+}
diff --git a/python/README.md b/python/README.md
index 1797fcc..b7e7300 100644
--- a/python/README.md
+++ b/python/README.md
@@ -361,6 +361,39 @@ parity with other ports beats style here.
 84/84 tests pass against the shared corpus.
 
 
+## Regex
+
+Uniform six-function regex API (see `/REGEX_API.md`). The Python port
+wraps the stdlib `re` module.
+
+### API
+
+| Function | Maps to |
+|---|---|
+| `re_compile(pattern, flags=0)`         | `re.compile(pattern, flags)` |
+| `re_test(pattern, input)`              | `bool(re.search(pattern, input))` |
+| `re_find(pattern, input)`              | first match as `[whole, group1, ...]` or `None` |
+| `re_find_all(pattern, input)`          | all matches, one row per match |
+| `re_replace(pattern, input, repl)`     | `re.sub(pattern, repl, input)` |
+| `re_escape(s)`                         | `re.escape(s)` |
+
+### Dialect
+
+Patterns must stay inside the **RE2 subset** documented in `/REGEX.md`.
+Python's `re` supports backreferences and lookaround; using them will
+not be portable to the Go / Rust / C / Lua / Zig ports.
+
+### Sharp edges
+
+- **Catastrophic backtracking.** Python's `re` (the default C engine)
+  is backtracking. `^(a+)+$` against 22 a's plus `!` runs ~190 ms here;
+  RE2-style ports finish the same case in <0.1 ms. Use flat patterns.
+- **Zero-width `replace`.** `re_replace("a*", "abc", "X")` returns
+  `"XXbXcX"` — the ECMA convention shared by all PCRE/ECMA/.NET/Java/Onigmo engines plus the in-tree Thompson ports. Go (RE2) returns `"XbXcX"` instead; see `/REGEX_PATHOLOGICAL.md`.
+
+See `/REGEX_PATHOLOGICAL.md` for the cross-port pathological-input panel.
+
+
 ## Build and test
 
 ```bash
diff --git a/python/tests/test_regex_pathological.py b/python/tests/test_regex_pathological.py
new file mode 100644
index 0000000..2ce9c5f
--- /dev/null
+++ b/python/tests/test_regex_pathological.py
@@ -0,0 +1,51 @@
+# RUN: python -m unittest discover -s tests
+#
+# Discovery test: pathological regex inputs run against the port's re_* API.
+# The goal is to surface which inputs cause errors, hangs, or surprising
+# output across ports — NOT to assert any specific behaviour. Each case
+# wraps the call so one failure does not mask the others.
+# The panel is the same in every port (see REGEX.md).
+
+import json
+import time
+import unittest
+
+from voxgig_struct.voxgig_struct import (
+    re_compile,
+    re_find,
+    re_find_all,
+    re_replace,
+    re_test,
+)
+
+
+def record(label, fn):
+    t0 = time.perf_counter()
+    try:
+        r = fn()
+        outcome = f'OK | {json.dumps(r, default=str)}'
+    except Exception as e:
+        outcome = f'ERR | {type(e).__name__}: {e}'
+    ms = (time.perf_counter() - t0) * 1000.0
+    print(f'[regex-discovery] {label} | {ms:.2f}ms | {outcome}')
+
+
+class PathologicalRegex(unittest.TestCase):
+    def test_panel(self):
+        A22 = 'a' * 22
+        NEST40 = '(' * 40 + 'a' + ')' * 40
+
+        record('P1_redos_nested_plus', lambda: re_test('^(a+)+$', A22 + '!'))
+        record('P2_redos_alt_overlap', lambda: re_test('^(a|aa)+$', A22 + '!'))
+        record('P3_empty_repeat_replace', lambda: re_replace('a*', 'abc', 'X'))
+        record('P4_unicode_replace_dot', lambda: re_replace(r'\.', 'café.au.lait', '/'))
+        record('P5_unicode_find_codepoint', lambda: re_find('é', 'café au lait'))
+        record('P6_deep_nesting_compile', lambda: re_test(NEST40, 'a'))
+        record('P7_big_bounded_quantifier', lambda: re_test('^a{0,10000}b$', 'a' * 10 + 'b'))
+        record('P8_invalid_pattern', lambda: re_compile('[abc'))
+        record('P9_backref_re2_forbidden', lambda: re_test(r'^(a+)\1$', 'aaaa'))
+        record('P10_find_all_zero_width', lambda: re_find_all('a*', 'bbb'))
+
+
+if __name__ == '__main__':
+    unittest.main()
diff --git a/ruby/README.md b/ruby/README.md
index 718cff1..20f36ac 100644
--- a/ruby/README.md
+++ b/ruby/README.md
@@ -345,6 +345,41 @@ and a `maxdepth` parameter, matching the canonical algorithm.
 75/75 tests pass, 150 assertions.
 
 
+## Regex
+
+Uniform six-function regex API (see `/REGEX_API.md`). The Ruby port
+wraps the built-in `Regexp` (Onigmo engine).
+
+### API
+
+| Function | Maps to |
+|---|---|
+| `re_compile(pattern)`              | `Regexp.new(pattern)` |
+| `re_test(pattern, input)`          | `input =~ re` |
+| `re_find(pattern, input)`          | `input.match(re)` → `[whole, group1, ...]` |
+| `re_find_all(pattern, input)`      | `input.scan(re)` (one row per match) |
+| `re_replace(pattern, input, repl)` | `input.gsub(re, repl)` |
+| `re_escape(s)`                     | `Regexp.escape(s)` |
+
+### Dialect
+
+Patterns must stay inside the **RE2 subset** documented in `/REGEX.md`.
+Onigmo supports backreferences and lookaround; using them will not be
+portable to the Go / Rust / C / Lua / Zig ports.
+
+### Sharp edges
+
+- **Catastrophic backtracking.** Onigmo has internal mitigations for
+  some classic ReDoS shapes — `^(a+)+$` against 22 a's plus `!` runs
+  in microseconds here. Larger inputs or different shapes can still
+  blow up; the safe rule is to stay inside the RE2 subset and avoid
+  nested quantifiers.
+- **Zero-width `replace`.** `re_replace("a*", "abc", "X")` returns
+  `"XXbXcX"` — the ECMA convention shared by all PCRE/ECMA/.NET/Java/Onigmo engines plus the in-tree Thompson ports. Go (RE2) returns `"XbXcX"` instead; see `/REGEX_PATHOLOGICAL.md`.
+
+See `/REGEX_PATHOLOGICAL.md` for the cross-port pathological-input panel.
+
+
 ## Build and test
 
 ```bash
diff --git a/ruby/test_regex_pathological.rb b/ruby/test_regex_pathological.rb
new file mode 100644
index 0000000..1fde671
--- /dev/null
+++ b/ruby/test_regex_pathological.rb
@@ -0,0 +1,37 @@
+require 'minitest/autorun'
+require 'json'
+require_relative 'voxgig_struct'
+
+# Discovery test: pathological regex inputs run against the port's re_* API.
+# Goal is to surface failures across ports, not to assert behaviour.
+# Panel is the same in every port (see REGEX.md).
+
+def record(label, &block)
+  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
+  begin
+    r = block.call
+    outcome = "OK | #{JSON.generate(r)}"
+  rescue StandardError => e
+    outcome = "ERR | #{e.class.name}: #{e.message}"
+  end
+  ms = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0) * 1000.0
+  printf("[regex-discovery] %s | %.2fms | %s\n", label, ms, outcome)
+end
+
+class PathologicalRegexTest < Minitest::Test
+  def test_panel
+    a22 = 'a' * 22
+    nest40 = "#{'(' * 40}a#{')' * 40}"
+
+    record('P1_redos_nested_plus')      { VoxgigStruct.re_test('^(a+)+$', "#{a22}!") }
+    record('P2_redos_alt_overlap')      { VoxgigStruct.re_test('^(a|aa)+$', "#{a22}!") }
+    record('P3_empty_repeat_replace')   { VoxgigStruct.re_replace('a*', 'abc', 'X') }
+    record('P4_unicode_replace_dot')    { VoxgigStruct.re_replace('\\.', 'café.au.lait', '/') }
+    record('P5_unicode_find_codepoint') { VoxgigStruct.re_find('é', 'café au lait') }
+    record('P6_deep_nesting_compile')   { VoxgigStruct.re_test(nest40, 'a') }
+    record('P7_big_bounded_quantifier') { VoxgigStruct.re_test('^a{0,10000}b$', "#{'a' * 10}b") }
+    record('P8_invalid_pattern')        { VoxgigStruct.re_compile('[abc') }
+    record('P9_backref_re2_forbidden')  { VoxgigStruct.re_test('^(a+)\\1$', 'aaaa') }
+    record('P10_find_all_zero_width')   { VoxgigStruct.re_find_all('a*', 'bbb') }
+  end
+end
diff --git a/rust/README.md b/rust/README.md
index b837949..4bb3ba9 100644
--- a/rust/README.md
+++ b/rust/README.md
@@ -126,3 +126,55 @@ Rust has no optional/overloaded parameters, so:
 
 See [`REPORT.md`](../REPORT.md#rust-rust) for the rust-port adaptations
 write-up, and [`../NOTES.md`](../NOTES.md) for cross-port quirks.
+
+
+## Regex
+
+Uniform six-function regex API (see `/REGEX_API.md`). The Rust port
+**ships its own RE2-subset engine** in `src/re.rs` — no `regex` crate
+dependency, no third-party crates at all (`Cargo.toml` lists none for
+runtime).
+
+### API
+
+| Function | Returns |
+|---|---|
+| `re_compile(pattern)`           | `Result<Regex, RegexError>` |
+| `re_test(pattern, input)`       | `bool` |
+| `re_find(pattern, input)`       | `Option<Vec<String>>` — `[whole, group1, …]` |
+| `re_find_all(pattern, input)`   | `Vec<Vec<String>>` |
+| `re_replace(pattern, input, r)` | `String` |
+| `re_escape(s)`                  | `String` |
+
+### Dialect
+
+The in-tree engine implements the RE2 subset documented in
+`/REGEX.md`: literals + escapes, `.`, `^`/`$`, `* + ? {n} {n,} {n,m}`
+(greedy + lazy), classes incl. `\d \w \s` and friends, `\b`/`\B`,
+`(...)` / `(?:...)`, alternation.
+
+**Not supported** (by design — RE2 doesn't either):
+backreferences, lookaround, possessive quantifiers, atomic groups.
+Backref patterns like `^(a+)\1$` *compile* (the parser doesn't reject
+`\1`) but never match the back-reference semantically, so `re_test`
+returns `false` rather than erroring. Don't rely on this — write
+portable patterns.
+
+### Sharp edges (Rust-specific)
+
+- **Bounded quantifiers are unrolled.** `a{0,10000}` compiles into
+  10 000 Split+atom-clone pairs. The matcher was previously recursive
+  during epsilon-closure and stack-overflowed on such patterns; it is
+  now iterative (`Threads::add` uses an explicit work stack).
+  `re_test("^a{0,10000}b$", …)` now runs in ~10 ms here.
+- **No catastrophic backtracking.** Thompson-NFA construction means
+  P1/P2 from the discovery panel run in microseconds.
+- **Zero-width `re_replace`.** `re_replace("a*", "abc", "X")` returns
+  `"XXbXcX"` — the convention shared with all PCRE/ECMA/Java/.NET
+  engines and the other in-tree Thompson ports (C / Lua / Zig). Go
+  (RE2) returns `"XbXcX"` instead; see `/REGEX_PATHOLOGICAL.md`.
+- **Single-threaded.** `Value` uses `Rc<RefCell<…>>` so it is
+  `!Send + !Sync`. The regex statics use `std::sync::LazyLock` and
+  are thread-safe in isolation, but the public API isn't.
+
+See `/REGEX_PATHOLOGICAL.md` for the cross-port pathological-input panel.
diff --git a/rust/src/re.rs b/rust/src/re.rs
index 4f06b0c..a7669e7 100644
--- a/rust/src/re.rs
+++ b/rust/src/re.rs
@@ -852,62 +852,68 @@ impl ThreadList {
     }
 
     fn add(&mut self, re: &Regex, input: &[u8], pc: usize, slots: &[i32], sp: usize) {
-        if pc >= re.code.len() {
-            return;
-        }
-        if self.visited[pc] == self.gen {
-            return;
-        }
-        self.visited[pc] = self.gen;
-        let insn = &re.code[pc];
-        match insn.op {
-            Op::Jmp(t) => {
-                self.add(re, input, t as usize, slots, sp);
-                return;
-            }
-            Op::Split(x, y) => {
-                self.add(re, input, x as usize, slots, sp);
-                self.add(re, input, y as usize, slots, sp);
-                return;
+        // Iterative epsilon-closure: we walk Jmp/Split/Save/Bol/Eol/Wb/Nwb
+        // until we hit a char-consuming op or Match. A recursive version
+        // overflows the stack on long Thompson chains (e.g. `a{0,10000}`
+        // unrolls into 10000 chained Splits — `cargo test` aborted with
+        // SIGABRT on the pathological-regex panel before this loop landed).
+        //
+        // The stack mirrors the recursive order: Split pushes y first then
+        // x, so x is processed first (priority preserved).
+        let mut stack: Vec<(usize, Vec<i32>)> = vec![(pc, slots.to_vec())];
+        while let Some((cur_pc, cur_slots)) = stack.pop() {
+            if cur_pc >= re.code.len() {
+                continue;
             }
-            Op::Save(slot) => {
-                let mut ns = slots.to_vec();
-                ns[slot] = sp as i32;
-                self.add(re, input, pc + 1, &ns, sp);
-                return;
+            if self.visited[cur_pc] == self.gen {
+                continue;
             }
-            Op::Bol => {
-                if sp == 0 || (sp - 1 < input.len() && input[sp - 1] == b'\n') {
-                    self.add(re, input, pc + 1, slots, sp);
+            self.visited[cur_pc] = self.gen;
+            match re.code[cur_pc].op {
+                Op::Jmp(t) => {
+                    stack.push((t as usize, cur_slots));
                 }
-                return;
-            }
-            Op::Eol => {
-                if sp >= input.len() || input[sp] == b'\n' {
-                    self.add(re, input, pc + 1, slots, sp);
+                Op::Split(x, y) => {
+                    // Push y first so x (higher priority) is popped first.
+                    stack.push((y as usize, cur_slots.clone()));
+                    stack.push((x as usize, cur_slots));
                 }
-                return;
-            }
-            Op::Wb | Op::Nwb => {
-                let left = sp > 0
-                    && sp - 1 < input.len()
-                    && (input[sp - 1].is_ascii_alphanumeric() || input[sp - 1] == b'_');
-                let right =
-                    sp < input.len() && (input[sp].is_ascii_alphanumeric() || input[sp] == b'_');
-                let at_boundary = left != right;
-                let want = matches!(insn.op, Op::Wb);
-                if at_boundary == want {
-                    self.add(re, input, pc + 1, slots, sp);
+                Op::Save(slot) => {
+                    let mut ns = cur_slots;
+                    ns[slot] = sp as i32;
+                    stack.push((cur_pc + 1, ns));
+                }
+                Op::Bol => {
+                    if sp == 0 || (sp - 1 < input.len() && input[sp - 1] == b'\n') {
+                        stack.push((cur_pc + 1, cur_slots));
+                    }
+                }
+                Op::Eol => {
+                    if sp >= input.len() || input[sp] == b'\n' {
+                        stack.push((cur_pc + 1, cur_slots));
+                    }
+                }
+                Op::Wb | Op::Nwb => {
+                    let left = sp > 0
+                        && sp - 1 < input.len()
+                        && (input[sp - 1].is_ascii_alphanumeric() || input[sp - 1] == b'_');
+                    let right = sp < input.len()
+                        && (input[sp].is_ascii_alphanumeric() || input[sp] == b'_');
+                    let at_boundary = left != right;
+                    let want = matches!(re.code[cur_pc].op, Op::Wb);
+                    if at_boundary == want {
+                        stack.push((cur_pc + 1, cur_slots));
+                    }
+                }
+                _ => {
+                    // Char-consuming op (or Match): queue thread.
+                    self.threads.push(Thread {
+                        pc: cur_pc,
+                        slots: cur_slots,
+                    });
                 }
-                return;
             }
-            _ => {}
         }
-        // Char-consuming op: queue thread.
-        self.threads.push(Thread {
-            pc,
-            slots: slots.to_vec(),
-        });
     }
 }
 
diff --git a/rust/tests/regex_pathological.rs b/rust/tests/regex_pathological.rs
new file mode 100644
index 0000000..c82f7ac
--- /dev/null
+++ b/rust/tests/regex_pathological.rs
@@ -0,0 +1,61 @@
+// Discovery test: pathological regex inputs run against the port's re_* API.
+// Goal is to surface failures across ports, not to assert behaviour.
+// Panel is the same in every port (see REGEX.md).
+
+use std::panic;
+use std::time::Instant;
+
+use voxgig_struct::{re_compile, re_find, re_find_all, re_replace, re_test};
+
+fn record<F, R>(label: &str, fn_: F)
+where
+    F: FnOnce() -> R + panic::UnwindSafe,
+    R: std::fmt::Debug,
+{
+    let t0 = Instant::now();
+    let outcome = match panic::catch_unwind(fn_) {
+        Ok(r) => format!("OK | {:?}", r),
+        Err(e) => {
+            let msg = if let Some(s) = e.downcast_ref::<&str>() {
+                s.to_string()
+            } else if let Some(s) = e.downcast_ref::<String>() {
+                s.clone()
+            } else {
+                "<non-string panic>".to_string()
+            };
+            format!("ERR | panic: {}", msg)
+        }
+    };
+    let ms = t0.elapsed().as_secs_f64() * 1000.0;
+    println!("[regex-discovery] {} | {:.2}ms | {}", label, ms, outcome);
+}
+
+#[test]
+fn regex_pathological_discovery() {
+    let a22: String = "a".repeat(22);
+    let nest40: String = "(".repeat(40) + "a" + &")".repeat(40);
+
+    record("P1_redos_nested_plus", || {
+        re_test("^(a+)+$", &(a22.clone() + "!"))
+    });
+    record("P2_redos_alt_overlap", || {
+        re_test("^(a|aa)+$", &(a22.clone() + "!"))
+    });
+    record("P3_empty_repeat_replace", || re_replace("a*", "abc", "X"));
+    record("P4_unicode_replace_dot", || {
+        re_replace(r"\.", "café.au.lait", "/")
+    });
+    record("P5_unicode_find_codepoint", || re_find("é", "café au lait"));
+    record("P6_deep_nesting_compile", || re_test(&nest40, "a"));
+    record("P7_big_bounded_quantifier", || {
+        re_test("^a{0,10000}b$", &("a".repeat(10) + "b"))
+    });
+    record("P8_invalid_pattern", || {
+        re_compile("[abc")
+            .map(|_| ())
+            .err()
+            .map(|e| format!("{:?}", e))
+    });
+    record("P9_backref_re2_forbidden", || re_test(r"^(a+)\1$", "aaaa"));
+    record("P10_find_all_zero_width", || re_find_all("a*", "bbb"));
+}
diff --git a/swift/README.md b/swift/README.md
index 9243e85..4e59a11 100644
--- a/swift/README.md
+++ b/swift/README.md
@@ -145,6 +145,42 @@ order. `JSON.stringify(value, indent: 2)` serialises back.
   `__NULL__` round-trip for the `inject.string` and `select.*`
   sets exactly as the canonical TS runner does.
 
+## Regex
+
+Uniform six-function regex API (see `/REGEX_API.md`). The Swift port
+wraps `NSRegularExpression`.
+
+### API
+
+| Function | Returns |
+|---|---|
+| `re_compile(pattern, flags?)`         | `NSRegularExpression?` (nil on bad pattern) |
+| `re_test(pattern, input)`             | `Bool` |
+| `re_find(pattern, input)`             | `Value.list([whole, group1, …])` or `.noval` |
+| `re_find_all(pattern, input)`         | `Value.list([...])` |
+| `re_replace(pattern, input, repl)`    | `String` |
+| `re_escape(v)`                        | `String` |
+
+### Dialect
+
+Patterns must stay inside the **RE2 subset** documented in `/REGEX.md`.
+`NSRegularExpression` (ICU-based) supports backreferences and lookaround;
+using them will not be portable.
+
+### Sharp edges
+
+- **Catastrophic backtracking.** ICU regex is backtracking. Stay
+  inside the RE2 subset and prefer flat patterns.
+- **Compile failures are nil, not throws.** `re_compile` returns
+  `nil` on bad pattern (the underlying `try?` swallows the error).
+  Callers should check the optional rather than rely on an exception.
+- **`Value` shape for `re_find` / `re_find_all`.** The Swift port
+  threads results through the in-tree `Value` enum (matching the
+  rest of the API surface), not raw arrays.
+
+See `/REGEX_PATHOLOGICAL.md` for the cross-port pathological-input panel.
+
+
 ## Tests
 
 ```bash
diff --git a/swift/Tests/VoxgigStructTests/RegexPathologicalTests.swift b/swift/Tests/VoxgigStructTests/RegexPathologicalTests.swift
new file mode 100644
index 0000000..390182e
--- /dev/null
+++ b/swift/Tests/VoxgigStructTests/RegexPathologicalTests.swift
@@ -0,0 +1,40 @@
+// Discovery test: pathological regex inputs run against the port's re_* API.
+// Goal is to surface failures across ports, not to assert behaviour.
+// Panel is the same in every port (see REGEX.md).
+
+import XCTest
+
+@testable import VoxgigStruct
+
+final class RegexPathologicalTests: XCTestCase {
+  private func record(_ label: String, _ fn: () -> Any?) {
+    let t0 = DispatchTime.now()
+    let value = fn()
+    let elapsedNs = DispatchTime.now().uptimeNanoseconds - t0.uptimeNanoseconds
+    let ms = Double(elapsedNs) / 1_000_000.0
+    let outcome: String
+    if let value = value {
+      outcome = "OK | \(value)"
+    } else {
+      outcome = "OK | null"
+    }
+    print(String(format: "[regex-discovery] %@ | %.2fms | %@", label, ms, outcome))
+  }
+
+  func testPanel() {
+    let a22 = String(repeating: "a", count: 22)
+    let nest40 = String(repeating: "(", count: 40) + "a" + String(repeating: ")", count: 40)
+    let p7Input = String(repeating: "a", count: 10) + "b"
+
+    record("P1_redos_nested_plus") { re_test(.string("^(a+)+$"), a22 + "!") }
+    record("P2_redos_alt_overlap") { re_test(.string("^(a|aa)+$"), a22 + "!") }
+    record("P3_empty_repeat_replace") { re_replace(.string("a*"), "abc", "X") }
+    record("P4_unicode_replace_dot") { re_replace(.string("\\."), "café.au.lait", "/") }
+    record("P5_unicode_find_codepoint") { re_find(.string("é"), "café au lait") }
+    record("P6_deep_nesting_compile") { re_test(.string(nest40), "a") }
+    record("P7_big_bounded_quantifier") { re_test(.string("^a{0,10000}b$"), p7Input) }
+    record("P8_invalid_pattern") { re_compile("[abc") as Any? }
+    record("P9_backref_re2_forbidden") { re_test(.string("^(a+)\\1$"), "aaaa") }
+    record("P10_find_all_zero_width") { re_find_all(.string("a*"), "bbb") }
+  }
+}
diff --git a/typescript/README.md b/typescript/README.md
index af759eb..56ebd4c 100644
--- a/typescript/README.md
+++ b/typescript/README.md
@@ -563,6 +563,53 @@ calls (one shared array per depth).  Clone it (`path.slice()`) if
 you need to retain it past the callback.
 
 
+## Regex
+
+The library exposes a uniform six-function regex API across every
+port (see `/REGEX_API.md` for the contract and `/REGEX.md` for the
+supported dialect). On TypeScript the canonical implementation is
+ECMAScript `RegExp`.
+
+### API
+
+| Function | Maps to |
+|---|---|
+| `re_compile(pattern, flags?)`     | `new RegExp(pattern, flags ?? 'g')` |
+| `re_test(pattern, input)`         | `pattern.test(input)` |
+| `re_find(pattern, input)`         | `input.match(pattern)` (non-global pattern) |
+| `re_find_all(pattern, input)`     | `[...input.matchAll(pattern)]` |
+| `re_replace(pattern, input, rep)` | `input.replace(pattern, rep)` (global pattern) |
+| `re_escape(s)`                    | escape `[.*+?^${}()|[\]\\]` in `s` |
+
+### Dialect
+
+Patterns must stay inside the **RE2 subset** documented in `/REGEX.md`:
+literals + escapes, `.`, `^`/`$`, `* + ? {n} {n,} {n,m}` (greedy + lazy),
+character classes incl. `\d \w \s` etc., `\b`/`\B`, `(...)` / `(?:...)` /
+`(?<name>...)`, alternation. ECMAScript `RegExp` supports backreferences
+and lookaround, but other ports do not — using those will not be
+portable.
+
+### Sharp edges
+
+- **Catastrophic backtracking.** ECMAScript `RegExp` uses backtracking;
+  nested quantifiers (e.g. `(a+)+`) against a non-matching suffix can be
+  exponential in the input length. The discovery panel measures ~180 ms
+  on Node 22 for `^(a+)+$` against 22 a's plus `!`. RE2-style engines
+  finish the same case in under 0.1 ms. Write linear-friendly patterns
+  (`a+` instead of `(a+)+`) and keep injected user input in
+  character classes, not in alternations.
+- **Zero-width `replace`.** `re_replace("a*", "abc", "X")` returns
+  `"XXbXcX"` here — the ECMA convention shared by every port whose
+  host engine is PCRE/ECMA/.NET/Java/Onigmo, plus the in-tree
+  Thompson NFA ports (Rust / C / Lua / Zig). Go is the exception:
+  RE2 returns `"XbXcX"`. Don't rely on cross-port identity of
+  zero-width replacement output — see `/REGEX_PATHOLOGICAL.md`.
+
+See `/REGEX_PATHOLOGICAL.md` for the cross-port pathological-input
+panel and per-port outcomes.
+
+
 ## Build and test
 
 ```bash
diff --git a/typescript/dist-test/regex_pathological.test.js b/typescript/dist-test/regex_pathological.test.js
new file mode 100644
index 0000000..50202df
--- /dev/null
+++ b/typescript/dist-test/regex_pathological.test.js
@@ -0,0 +1,40 @@
+"use strict";
+// VERSION: @voxgig/struct 0.1.0
+//
+// Discovery test: pathological regex inputs run against the port's re_* API.
+// Each case wraps the call so one failure does not mask the others.
+// The panel is the same in every port (see REGEX.md).
+Object.defineProperty(exports, "__esModule", { value: true });
+const node_test_1 = require("node:test");
+const StructUtility_1 = require("../dist/StructUtility");
+function rep(s, n) {
+    return new Array(n + 1).join(s);
+}
+function record(label, fn) {
+    const t0 = process.hrtime.bigint();
+    let outcome;
+    try {
+        const r = fn();
+        outcome = `OK | ${JSON.stringify(r)}`;
+    }
+    catch (e) {
+        outcome = `ERR | ${e && e.message ? e.message : String(e)}`;
+    }
+    const ms = Number(process.hrtime.bigint() - t0) / 1e6;
+    console.log(`[regex-discovery] ${label} | ${ms.toFixed(2)}ms | ${outcome}`);
+}
+(0, node_test_1.test)('regex pathological discovery', () => {
+    const A22 = rep('a', 22);
+    const NEST40 = rep('(', 40) + 'a' + rep(')', 40);
+    record('P1_redos_nested_plus', () => (0, StructUtility_1.re_test)('^(a+)+$', A22 + '!'));
+    record('P2_redos_alt_overlap', () => (0, StructUtility_1.re_test)('^(a|aa)+$', A22 + '!'));
+    record('P3_empty_repeat_replace', () => (0, StructUtility_1.re_replace)('a*', 'abc', 'X'));
+    record('P4_unicode_replace_dot', () => (0, StructUtility_1.re_replace)('\\.', 'café.au.lait', '/'));
+    record('P5_unicode_find_codepoint', () => (0, StructUtility_1.re_find)('é', 'café au lait'));
+    record('P6_deep_nesting_compile', () => (0, StructUtility_1.re_test)(NEST40, 'a'));
+    record('P7_big_bounded_quantifier', () => (0, StructUtility_1.re_test)('^a{0,10000}b$', rep('a', 10) + 'b'));
+    record('P8_invalid_pattern', () => (0, StructUtility_1.re_compile)('[abc'));
+    record('P9_backref_re2_forbidden', () => (0, StructUtility_1.re_test)('^(a+)\\1$', 'aaaa'));
+    record('P10_find_all_zero_width', () => (0, StructUtility_1.re_find_all)('a*', 'bbb'));
+});
+//# sourceMappingURL=regex_pathological.test.js.map
\ No newline at end of file
diff --git a/typescript/dist-test/regex_pathological.test.js.map b/typescript/dist-test/regex_pathological.test.js.map
new file mode 100644
index 0000000..79b1e70
--- /dev/null
+++ b/typescript/dist-test/regex_pathological.test.js.map
@@ -0,0 +1 @@
+{"version":3,"file":"regex_pathological.test.js","sourceRoot":"","sources":["../test/regex_pathological.test.ts"],"names":[],"mappings":";AAAA,gCAAgC;AAChC,EAAE;AACF,6EAA6E;AAC7E,oEAAoE;AACpE,sDAAsD;;AAEtD,yCAAgC;AAEhC,yDAA6F;AAE7F,SAAS,GAAG,CAAC,CAAS,EAAE,CAAS;IAC/B,OAAO,IAAI,KAAK,CAAC,CAAC,GAAG,CAAC,CAAC,CAAC,IAAI,CAAC,CAAC,CAAC,CAAA;AACjC,CAAC;AAED,SAAS,MAAM,CAAC,KAAa,EAAE,EAAiB;IAC9C,MAAM,EAAE,GAAG,OAAO,CAAC,MAAM,CAAC,MAAM,EAAE,CAAA;IAClC,IAAI,OAAe,CAAA;IACnB,IAAI,CAAC;QACH,MAAM,CAAC,GAAG,EAAE,EAAE,CAAA;QACd,OAAO,GAAG,QAAQ,IAAI,CAAC,SAAS,CAAC,CAAC,CAAC,EAAE,CAAA;IACvC,CAAC;IAAC,OAAO,CAAM,EAAE,CAAC;QAChB,OAAO,GAAG,SAAS,CAAC,IAAI,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,MAAM,CAAC,CAAC,CAAC,EAAE,CAAA;IAC7D,CAAC;IACD,MAAM,EAAE,GAAG,MAAM,CAAC,OAAO,CAAC,MAAM,CAAC,MAAM,EAAE,GAAG,EAAE,CAAC,GAAG,GAAG,CAAA;IACrD,OAAO,CAAC,GAAG,CAAC,qBAAqB,KAAK,MAAM,EAAE,CAAC,OAAO,CAAC,CAAC,CAAC,QAAQ,OAAO,EAAE,CAAC,CAAA;AAC7E,CAAC;AAED,IAAA,gBAAI,EAAC,8BAA8B,EAAE,GAAG,EAAE;IACxC,MAAM,GAAG,GAAG,GAAG,CAAC,GAAG,EAAE,EAAE,CAAC,CAAA;IACxB,MAAM,MAAM,GAAG,GAAG,CAAC,GAAG,EAAE,EAAE,CAAC,GAAG,GAAG,GAAG,GAAG,CAAC,GAAG,EAAE,EAAE,CAAC,CAAA;IAEhD,MAAM,CAAC,sBAAsB,EAAE,GAAG,EAAE,CAAC,IAAA,uBAAO,EAAC,SAAS,EAAE,GAAG,GAAG,GAAG,CAAC,CAAC,CAAA;IACnE,MAAM,CAAC,sBAAsB,EAAE,GAAG,EAAE,CAAC,IAAA,uBAAO,EAAC,WAAW,EAAE,GAAG,GAAG,GAAG,CAAC,CAAC,CAAA;IACrE,MAAM,CAAC,yBAAyB,EAAE,GAAG,EAAE,CAAC,IAAA,0BAAU,EAAC,IAAI,EAAE,KAAK,EAAE,GAAG,CAAC,CAAC,CAAA;IACrE,MAAM,CAAC,wBAAwB,EAAE,GAAG,EAAE,CAAC,IAAA,0BAAU,EAAC,KAAK,EAAE,cAAc,EAAE,GAAG,CAAC,CAAC,CAAA;IAC9E,MAAM,CAAC,2BAA2B,EAAE,GAAG,EAAE,CAAC,IAAA,uBAAO,EAAC,GAAG,EAAE,cAAc,CAAC,CAAC,CAAA;IACvE,MAAM,CAAC,yBAAyB,EAAE,GAAG,EAAE,CAAC,IAAA,uBAAO,EAAC,MAAM,EAAE,GAAG,CAAC,CAAC,CAAA;IAC7D,MAAM,CAAC,2BAA2B,EAAE,GAAG,EAAE,CAAC,IAAA,uBAAO,EAAC,eAAe,EAAE,GAAG,CAAC,GAAG,EAAE,EAAE,CAAC,GAAG,GAAG,CAAC,CAAC,CAAA;IACvF,MAAM,CAAC,oBAAoB,EAAE,GAAG,EAAE,CAAC,IAAA,0BAAU,EAAC,MAAM,CAAC,CAAC,CAAA;IACtD,MAAM,CAAC,0BAA0B,EAAE,GAAG,EAAE,CAAC,IAAA,uBAAO,EAAC,WAAW,EAAE,MAAM,CAAC,CAAC,CAAA;IACtE,MAAM,CAAC,yBAAyB,EAAE,GAAG,EAAE,CAAC,IAAA,2BAAW,EAAC,IAAI,EAAE,KAAK,CAAC,CAAC,CAAA;AACnE,CAAC,CAAC,CAAA"}
\ No newline at end of file
diff --git a/typescript/test/regex_pathological.test.ts b/typescript/test/regex_pathological.test.ts
new file mode 100644
index 0000000..d1b18b9
--- /dev/null
+++ b/typescript/test/regex_pathological.test.ts
@@ -0,0 +1,42 @@
+// VERSION: @voxgig/struct 0.1.0
+//
+// Discovery test: pathological regex inputs run against the port's re_* API.
+// Each case wraps the call so one failure does not mask the others.
+// The panel is the same in every port (see REGEX.md).
+
+import { test } from 'node:test'
+
+import { re_compile, re_test, re_find, re_find_all, re_replace } from '../dist/StructUtility'
+
+function rep(s: string, n: number): string {
+  return new Array(n + 1).join(s)
+}
+
+function record(label: string, fn: () => unknown): void {
+  const t0 = process.hrtime.bigint()
+  let outcome: string
+  try {
+    const r = fn()
+    outcome = `OK | ${JSON.stringify(r)}`
+  } catch (e: any) {
+    outcome = `ERR | ${e && e.message ? e.message : String(e)}`
+  }
+  const ms = Number(process.hrtime.bigint() - t0) / 1e6
+  console.log(`[regex-discovery] ${label} | ${ms.toFixed(2)}ms | ${outcome}`)
+}
+
+test('regex pathological discovery', () => {
+  const A22 = rep('a', 22)
+  const NEST40 = rep('(', 40) + 'a' + rep(')', 40)
+
+  record('P1_redos_nested_plus', () => re_test('^(a+)+$', A22 + '!'))
+  record('P2_redos_alt_overlap', () => re_test('^(a|aa)+$', A22 + '!'))
+  record('P3_empty_repeat_replace', () => re_replace('a*', 'abc', 'X'))
+  record('P4_unicode_replace_dot', () => re_replace('\\.', 'café.au.lait', '/'))
+  record('P5_unicode_find_codepoint', () => re_find('é', 'café au lait'))
+  record('P6_deep_nesting_compile', () => re_test(NEST40, 'a'))
+  record('P7_big_bounded_quantifier', () => re_test('^a{0,10000}b$', rep('a', 10) + 'b'))
+  record('P8_invalid_pattern', () => re_compile('[abc'))
+  record('P9_backref_re2_forbidden', () => re_test('^(a+)\\1$', 'aaaa'))
+  record('P10_find_all_zero_width', () => re_find_all('a*', 'bbb'))
+})
diff --git a/zig/README.md b/zig/README.md
index f3fe854..bcd9d29 100644
--- a/zig/README.md
+++ b/zig/README.md
@@ -251,6 +251,59 @@ subsystems present) but the test corpus pass rate is being raised.
 60+ tests pass; see [`../REPORT.md`](../REPORT.md) for current status.
 
 
+## Regex
+
+Uniform regex API (see `/REGEX_API.md`). The Zig port **ships its own
+RE2-subset engine** in `src/regex.zig` (Thompson NFA), replacing the
+earlier `mvzr` dependency. No third-party runtime crates.
+
+### API
+
+| Function | Returns |
+|---|---|
+| `re_compile(pattern)`                          | `?ReCompiled` (nil on bad pattern) |
+| `re_test(pattern, input)`                      | `bool` |
+| `re_find(alloc, pattern, input)`               | `?[][]const u8` (caller frees) |
+| `re_find_all(alloc, pattern, input)`           | `?[][][]const u8` (caller frees both levels) |
+| `re_replace(alloc, pattern, input, repl)`      | `![]u8` (caller frees) |
+| `re_escape(alloc, s)`                          | `![]const u8` |
+
+`ReCompiled` is an alias for the engine's `Regex` type
+(`src/regex.zig`); it owns an instruction buffer and is released with
+`.deinit()`.
+
+### Dialect
+
+The in-tree engine implements the RE2 subset documented in `/REGEX.md`:
+literals + escapes, `.`, `^`/`$`, `* + ? {n} {n,} {n,m}` (greedy + lazy),
+classes incl. `\d \w \s` and friends, `\b`/`\B`, `(...)` / `(?:...)`,
+alternation.
+
+**Not supported** (by design — RE2 doesn't either): backreferences,
+lookaround, possessive quantifiers, atomic groups.
+
+### Sharp edges (Zig-specific)
+
+- **Allocator-explicit.** `re_test` and `re_compile` use
+  `std.heap.page_allocator` internally so callers don't have to pipe
+  one through every call; the find/find_all/replace wrappers ask for
+  one because they return caller-owned slices.
+- **`re_find` / `re_find_all` slices alias the input.** They are
+  valid only while `input` is alive. Copy if you need to retain past
+  the input's lifetime.
+- **`re_replace` takes the replacement literally** in the current
+  wrapper — no `$&`/`$1..` expansion. The engine's lower-level
+  callback variant gives full control.
+- **No catastrophic backtracking.** Thompson-NFA construction; P1/P2
+  finish in microseconds.
+- **Zero-width `re_replace`** matches the in-tree-Thompson and
+  PCRE/ECMA convention: `re_replace(alloc, "a*", "abc", "X")` returns
+  `"XXbXcX"`. Go (RE2) returns `"XbXcX"` instead; this is RE2's
+  chosen rule and we don't paper over it.
+
+See `/REGEX_PATHOLOGICAL.md` for the cross-port pathological-input panel.
+
+
 ## Build and test
 
 ```bash
diff --git a/zig/src/regex.zig b/zig/src/regex.zig
index 17ad86d..62d118c 100644
--- a/zig/src/regex.zig
+++ b/zig/src/regex.zig
@@ -133,7 +133,7 @@ pub const Regex = struct {
         return false;
     }
 
-    fn findFirst(self: Regex, input: []const u8) ?[]i32 {
+    pub fn findFirst(self: Regex, input: []const u8) ?[]i32 {
         var start: usize = 0;
         while (true) {
             if (self.matchAt(input, start)) |slots| return slots;
@@ -143,6 +143,16 @@ pub const Regex = struct {
         }
     }
 
+    pub fn findFrom(self: Regex, input: []const u8, from: usize) ?[]i32 {
+        var start: usize = from;
+        while (true) {
+            if (self.matchAt(input, start)) |slots| return slots;
+            if (self.anchored_start) return null;
+            if (start > input.len) return null;
+            start += 1;
+        }
+    }
+
     fn matchAt(self: Regex, input: []const u8, start: usize) ?[]i32 {
         const nslots = self.ngroups * 2;
         var cur = ThreadList.init(self.allocator, self.code.len) catch return null;
diff --git a/zig/src/struct.zig b/zig/src/struct.zig
index d4f124c..a1df63c 100644
--- a/zig/src/struct.zig
+++ b/zig/src/struct.zig
@@ -773,6 +773,102 @@ pub fn re_test(pattern: []const u8, input: []const u8) bool {
     return re.isMatch(input);
 }
 
+/// re_find — first match as `[whole, capture1, ...]`. Slices alias `input`,
+/// so the result is valid only while `input` is alive. Returns null on
+/// compile error or no-match. The outer slice and inner slices must be
+/// freed by the caller.
+pub fn re_find(allocator: Allocator, pattern: []const u8, input: []const u8) ?[][]const u8 {
+    var re = _re_engine.compile(std.heap.page_allocator, pattern) orelse return null;
+    defer re.deinit();
+    const slots = re.findFirst(input) orelse return null;
+    defer std.heap.page_allocator.free(slots);
+    const ngroups = re.ngroups;
+    const out = allocator.alloc([]const u8, ngroups) catch return null;
+    var g: usize = 0;
+    while (g < ngroups) : (g += 1) {
+        const s = slots[2 * g];
+        const e = slots[2 * g + 1];
+        if (s < 0 or e < s) {
+            out[g] = "";
+        } else {
+            out[g] = input[@as(usize, @intCast(s))..@as(usize, @intCast(e))];
+        }
+    }
+    return out;
+}
+
+/// re_find_all — every non-overlapping match. Caller owns the returned
+/// slice-of-slices and must free both levels.
+pub fn re_find_all(allocator: Allocator, pattern: []const u8, input: []const u8) ?[][][]const u8 {
+    var re = _re_engine.compile(std.heap.page_allocator, pattern) orelse return null;
+    defer re.deinit();
+    var rows = std.ArrayList([][]const u8).init(allocator);
+    defer rows.deinit();
+    var pos: usize = 0;
+    while (pos <= input.len) {
+        const slots = re.findFrom(input, pos) orelse break;
+        defer std.heap.page_allocator.free(slots);
+        const ngroups = re.ngroups;
+        const row = allocator.alloc([]const u8, ngroups) catch return null;
+        var g: usize = 0;
+        while (g < ngroups) : (g += 1) {
+            const s = slots[2 * g];
+            const e = slots[2 * g + 1];
+            if (s < 0 or e < s) {
+                row[g] = "";
+            } else {
+                row[g] = input[@as(usize, @intCast(s))..@as(usize, @intCast(e))];
+            }
+        }
+        rows.append(row) catch return null;
+        const mstart = @as(usize, @intCast(slots[0]));
+        const mend = @as(usize, @intCast(slots[1]));
+        if (mend == mstart) {
+            pos = mend + 1;
+        } else {
+            pos = mend;
+        }
+    }
+    return rows.toOwnedSlice() catch return null;
+}
+
+/// re_replace — replace every match in `input` with `replacement`. The
+/// replacement string is taken literally; $& / $1.. substitution is not
+/// expanded in this minimal wrapper (matches the engine's current shape).
+/// On zero-width match the current rune is emitted and we advance by one
+/// byte, mirroring the ECMAScript convention used by other ports.
+pub fn re_replace(allocator: Allocator, pattern: []const u8, input: []const u8, replacement: []const u8) ![]u8 {
+    var re = _re_engine.compile(std.heap.page_allocator, pattern) orelse {
+        return allocator.dupe(u8, input);
+    };
+    defer re.deinit();
+    var out = std.ArrayList(u8).init(allocator);
+    defer out.deinit();
+    var pos: usize = 0;
+    while (pos <= input.len) {
+        const slots = re.findFrom(input, pos) orelse {
+            try out.appendSlice(input[pos..]);
+            break;
+        };
+        defer std.heap.page_allocator.free(slots);
+        const mstart = @as(usize, @intCast(slots[0]));
+        const mend = @as(usize, @intCast(slots[1]));
+        try out.appendSlice(input[pos..mstart]);
+        try out.appendSlice(replacement);
+        if (mend == mstart) {
+            if (mstart < input.len) {
+                try out.append(input[mstart]);
+                pos = mstart + 1;
+            } else {
+                pos = mstart + 1;
+            }
+        } else {
+            pos = mend;
+        }
+    }
+    return out.toOwnedSlice();
+}
+
 pub fn re_escape(allocator: Allocator, s: []const u8) ![]const u8 {
     return escre(allocator, s);
 }
diff --git a/zig/test/regex_pathological.zig b/zig/test/regex_pathological.zig
new file mode 100644
index 0000000..5ae19cd
--- /dev/null
+++ b/zig/test/regex_pathological.zig
@@ -0,0 +1,83 @@
+// Discovery test: pathological regex inputs run against the port's re_* API.
+// Goal is to surface failures across ports, not to assert behaviour.
+// Panel is the same in every port (see REGEX.md).
+
+const std = @import("std");
+const voxgig_struct = @import("voxgig-struct");
+
+fn ms_since(t0: i128) f64 {
+    return @as(f64, @floatFromInt(std.time.nanoTimestamp() - t0)) / 1e6;
+}
+
+test "regex pathological discovery" {
+    const writer = std.io.getStdOut().writer();
+    var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
+    defer arena.deinit();
+    const alloc = arena.allocator();
+
+    const a22 = try alloc.alloc(u8, 22);
+    @memset(a22, 'a');
+    const p1_in = try std.fmt.allocPrint(alloc, "{s}!", .{a22});
+
+    var nest_buf: [120]u8 = undefined;
+    var pos: usize = 0;
+    while (pos < 40) : (pos += 1) nest_buf[pos] = '(';
+    nest_buf[pos] = 'a';
+    pos += 1;
+    var i: usize = 0;
+    while (i < 40) : (i += 1) {
+        nest_buf[pos] = ')';
+        pos += 1;
+    }
+    const nest40 = nest_buf[0..pos];
+
+    var t0 = std.time.nanoTimestamp();
+    const b1 = voxgig_struct.re_test("^(a+)+$", p1_in);
+    try writer.print("[regex-discovery] P1_redos_nested_plus | {d:.2}ms | OK | {}\n", .{ ms_since(t0), b1 });
+
+    t0 = std.time.nanoTimestamp();
+    const b2 = voxgig_struct.re_test("^(a|aa)+$", p1_in);
+    try writer.print("[regex-discovery] P2_redos_alt_overlap | {d:.2}ms | OK | {}\n", .{ ms_since(t0), b2 });
+
+    t0 = std.time.nanoTimestamp();
+    const p3 = try voxgig_struct.re_replace(alloc, "a*", "abc", "X");
+    try writer.print("[regex-discovery] P3_empty_repeat_replace | {d:.2}ms | OK | \"{s}\"\n", .{ ms_since(t0), p3 });
+
+    t0 = std.time.nanoTimestamp();
+    const p4 = try voxgig_struct.re_replace(alloc, "\\.", "café.au.lait", "/");
+    try writer.print("[regex-discovery] P4_unicode_replace_dot | {d:.2}ms | OK | \"{s}\"\n", .{ ms_since(t0), p4 });
+
+    t0 = std.time.nanoTimestamp();
+    if (voxgig_struct.re_find(alloc, "é", "café au lait")) |p5| {
+        try writer.print("[regex-discovery] P5_unicode_find_codepoint | {d:.2}ms | OK | [\"{s}\"]\n", .{ ms_since(t0), p5[0] });
+    } else {
+        try writer.print("[regex-discovery] P5_unicode_find_codepoint | {d:.2}ms | OK | null\n", .{ms_since(t0)});
+    }
+
+    t0 = std.time.nanoTimestamp();
+    const b6 = voxgig_struct.re_test(nest40, "a");
+    try writer.print("[regex-discovery] P6_deep_nesting_compile | {d:.2}ms | OK | {}\n", .{ ms_since(t0), b6 });
+
+    t0 = std.time.nanoTimestamp();
+    const b7 = voxgig_struct.re_test("^a{0,10000}b$", "aaaaaaaaaab");
+    try writer.print("[regex-discovery] P7_big_bounded_quantifier | {d:.2}ms | OK | {}\n", .{ ms_since(t0), b7 });
+
+    t0 = std.time.nanoTimestamp();
+    const p8 = voxgig_struct.re_compile("[abc");
+    if (p8 == null) {
+        try writer.print("[regex-discovery] P8_invalid_pattern | {d:.2}ms | ERR | compile returned null\n", .{ms_since(t0)});
+    } else {
+        try writer.print("[regex-discovery] P8_invalid_pattern | {d:.2}ms | OK | \"compiled\"\n", .{ms_since(t0)});
+    }
+
+    t0 = std.time.nanoTimestamp();
+    const b9 = voxgig_struct.re_test("^(a+)\\1$", "aaaa");
+    try writer.print("[regex-discovery] P9_backref_re2_forbidden | {d:.2}ms | OK | {}\n", .{ ms_since(t0), b9 });
+
+    t0 = std.time.nanoTimestamp();
+    if (voxgig_struct.re_find_all(alloc, "a*", "bbb")) |p10| {
+        try writer.print("[regex-discovery] P10_find_all_zero_width | {d:.2}ms | OK | <{} matches>\n", .{ ms_since(t0), p10.len });
+    } else {
+        try writer.print("[regex-discovery] P10_find_all_zero_width | {d:.2}ms | OK | null\n", .{ms_since(t0)});
+    }
+}