BQN FFI bindings for PCRE2 (Perl Compatible Regular Expressions).
bqnpcre provides a BQN interface to the PCRE2 regular expression library
via •FFI. It exposes five functions: Search, Replace,
Split, FindAll, and Compile.
- CBQN: the BQN implementation with •FFI support
- PCRE2: libpcre2-8-dev (Debian/Ubuntu) or pcre2-devel (Fedora/RHEL)
- A C compiler (gcc or clang)

make # produces libbqnpcre.so
make test # build and run tests
make clean # remove build artifacts

bqnpcre is designed to be used in place or as a git submodule. There is no system install step.
Clone the repo and build:
git clone https://github.com/linuxhd0/bqnpcre
cd bqnpcre
make

Import from your BQN program using the path to bqn/pcre2.bqn:
pcre ← •Import "/path/to/bqnpcre/bqn/pcre2.bqn"

To use it as a git submodule instead, add and build it, then import from the vendored path:
git submodule add https://github.com/linuxhd0/bqnpcre vendor/bqnpcre
cd vendor/bqnpcre && make && cd ../..
pcre ← •Import "vendor/bqnpcre/bqn/pcre2.bqn"

The first line of bqn/pcre2.bqn sets the path to the shared library:
lib ← "../libbqnpcre.so"

This path is resolved relative to bqn/pcre2.bqn itself (not the working
directory), so it works correctly regardless of where your program runs.
If you move libbqnpcre.so to a different location, update this line.
For example, to use an absolute path:
lib ← "/usr/local/lib/libbqnpcre.so"

pcre ← •Import "bqn/pcre2.bqn"
# Search - one-shot (compiles, matches, frees automatically)
# opts may be omitted: plain string or ⟨pattern, opts⟩ both work
result ← "(a+)b" pcre.Search "aaab"
result
┌─
╵ "aaab" 0 4
"aaa" 0 3
┘
⊏result
⟨ "aaab" 0 4 ⟩
⊏˘result
⟨ "aaab" "aaa" ⟩
1⊏˘result
⟨ 0 0 ⟩
2⊏˘result
⟨ 4 3 ⟩
# Replace - one-shot
# opts may be omitted: ⟨pattern, replacement⟩ or ⟨pattern, replacement, opts⟩
⟨"hello", "world"⟩ pcre.Replace "hello world"
"world world"
# Split - one-shot
"," pcre.Split "a,b,c"
⟨ "a" "b" "c" ⟩
# FindAll - one-shot
# Returns a list of n×3 arrays, one per match, to preserve per-match grouping
# when capture groups are present. Use ∾ to get a flat array when there are no
# capture groups.
fa ← "a+" pcre.FindAll "aabaa"
⊑fa
┌─
╵ "aa" 0 2
┘
1⊑fa
┌─
╵ "aa" 3 5
┘
∾fa
┌─
╵ "aa" 0 2
"aa" 3 5
┘
# FindAll with capture groups - list of n×3 arrays preserves per-match grouping
fa2 ← "([0-9]+)-([0-9]+)" pcre.FindAll "2024-03 and 2025-07"
fa2
⟨ ┌─ ┌─ ⟩
╵ "2024-03" 0 7 ╵ "2025-07" 12 19
"2024" 0 4 "2025" 12 16
"03" 5 7 "07" 17 19
┘ ┘
# Full match string from each match
(⊑⊏˘)¨fa2
⟨ "2024-03" "2025-07" ⟩
# Capture strings only (drop group 0 row)
{0⊏˘1↓𝕩}¨fa2
⟨ ⟨ "2024" "03" ⟩ ⟨ "2025" "07" ⟩ ⟩
# Compile a pattern for reuse - returns a namespace
# opts may be omitted: plain string or ⟨pattern, opts⟩ both work
pat ← pcre.Compile "(a+)b"
pat.ngroups
2
# Match with compiled pattern
result ← pat.Match "aaab"
result
┌─
╵ "aaab" 0 4
"aaa" 0 3
┘
# FindAll with compiled pattern
pat_a ← pcre.Compile "a+"
pat_a.FindAll "aabaa"
⟨ ┌─ ┌─ ⟩
╵ "aa" 0 2 ╵ "aa" 3 5
┘ ┘
# Split with compiled pattern
pat_c ← pcre.Compile ","
pat_c.Split "a,b,c"
⟨ "a" "b" "c" ⟩
# Replace with compiled pattern (𝕨=replacement, 𝕩=subject)
pat_h ← pcre.Compile "hello"
"world" pat_h.Replace "hello world"
"world world"
# Free compiled pattern when done (caller is responsible)
pat.Free @
pat_a.Free @
pat_c.Free @
pat_h.Free @
# Named capture groups
pat ← pcre.Compile "(?P<year>[0-9]+)-(?P<month>[0-9]+)"
names ← pat.names
names
┌─
╵ "month" 2
"year" 1
┘
⊏˘names
⟨ "month" "year" ⟩
1⊏˘names
⟨ 2 1 ⟩
# Look up a named group and extract its match value
result ← pat.Match "2024-03"
year_num ← 1⊑((⊑(⊏˘names)⊐⟨"year"⟩)⊏names)
year_num⊑⊏˘result
"2024"
pat.Free @

BQN strings are lists of Unicode codepoints. bqnpcre encodes every string argument to UTF-8 before passing it to PCRE2 and decodes results back to BQN strings. All match offsets (start and end) are codepoint offsets, not byte offsets.
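To make the offset convention concrete, here is a minimal pure-BQN sketch that does not call bqnpcre at all. It slices a subject with a start and end pair like the ones bqnpcre returns, then computes where each character would begin in a UTF-8 encoding, to show why byte offsets would differ. The names s, b, e, match, cp, bytes, and byte_starts are illustrative only, and the byte-length arithmetic is the general UTF-8 rule, not a description of how bqnpcre converts offsets internally.

```bqn
s ← "a⌊b"
b‿e ← 1‿2                 # start and end offsets of "⌊", in codepoints

# Codepoint offsets index the BQN string directly.
match ← (e-b)↑b↓s

# UTF-8 length of each character: 1 byte, plus one more for every
# threshold (128, 2048, 65536) that its codepoint reaches.
cp ← s-@
bytes ← 1 + +´˘ cp ≥⌜ 128‿2048‿65536

# Byte offset at which each character starts in the UTF-8 encoding.
byte_starts ← »+`bytes
```

Here match is "⌊" and byte_starts is ⟨0,1,4⟩: the final b sits at codepoint offset 2 but would sit at byte offset 4 in UTF-8.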
Literal Unicode characters in patterns and subjects work automatically:
"⌊" pcre.Search "a⌊b"
┌─
╵ "⌊" 1 2
┘
⟨"⌊","★"⟩ pcre.Replace "a⌊b"
"a★b"

For patterns that use Unicode-aware constructs (character classes, dot,
quantifiers applied to multi-byte characters), pass the "u" option to
enable PCRE2's UTF mode:
⟨"[⌊★]","u"⟩ pcre.FindAll "a⌊b★c"
⟨ ┌─ ┌─ ⟩
╵ "⌊" 1 2 ╵ "★" 3 4
┘ ┘

Compiles a regex pattern for reuse. Returns a namespace. The caller is
responsible for freeing the compiled code by calling pat.Free @ when done.

- pattern: Regex pattern string
- options: Options string (case-insensitive); omit to use no options:
  - i: Case insensitive
  - m: Multiline mode
  - s: Dotall (. matches newlines)
  - x: Extended mode
  - u: UTF mode
  - a: Anchored (match only at start of subject)

The returned namespace has the following fields:

| Field | Description |
|---|---|
| pattern | Pattern string |
| opts | Options string |
| ngroups | Number of groups, including group 0 |
| names | Named groups: rank-2 array, each row is name‿group_number, in alphabetical order; 0‿2⥊@ if none |
| Match | pat.Match subject: same return as pcre.Search |
| FindAll | pat.FindAll subject: same return as pcre.FindAll |
| Split | pat.Split subject: same return as pcre.Split |
| Replace | repl pat.Replace subject: same return as pcre.Replace |
| Free | pat.Free @: frees the compiled code |

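Since names maps group names to group numbers, and row k of a match result holds group k, a name lookup can be wrapped in a small helper. The sketch below is one way to do it; Group is a hypothetical helper that is not part of bqnpcre, and the names and result arrays are hand-written stand-ins copied from the named-group example above, so the block runs without the library.

```bqn
# Stand-ins for pat.names and for the result of pat.Match "2024-03".
names ← 2‿2⥊⟨"month",2,"year",1⟩
result ← 3‿3⥊⟨"2024-03",0,7, "2024",0,4, "03",5,7⟩

# Group: hypothetical helper, not part of bqnpcre.
# 𝕨 is ⟨names, result⟩, 𝕩 is a group name; returns the captured string.
# Assumes the name is present in the names table.
Group ← {
  nm‿res ← 𝕨
  i ← ⊑(⊏˘nm)⊐⟨𝕩⟩      # row of the name within the names table
  g ← 1⊑i⊏nm            # group number stored in that row
  g⊑⊏˘res               # string column of the match, at row g
}

year ← ⟨names,result⟩ Group "year"
month ← ⟨names,result⟩ Group "month"
```

With the sample data, year is "2024" and month is "03". A name that is missing from the table is not handled here; a real helper would want to check for that case first.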
One-shot search: compiles, matches, and frees the compiled code automatically.
Returns the same n×3 array as Match. On no match, returns @. On error, throws.
Find all non-overlapping matches of pattern in subject.

- pattern: Regex pattern string
- opts: Options string (same flags as Compile); omit to use no options
- subject: String to search

Returns a list of n×3 arrays, one per match, with the same row structure as
Match. Offsets are absolute positions in the original subject. Each match is a
separate array to preserve per-match grouping when capture groups are present;
use ∾ to get a flat array when there are no capture groups. Returns ⟨⟩ if
no matches. On error, throws.
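
As a sketch of working with this return shape when the pattern has no capture groups, the block below joins the per-match arrays and pulls out columns. The value fa is a hand-written stand-in that mirrors the documented result of "a+" pcre.FindAll "aabaa", so the block runs without the library; the names flat, strings, starts, and lengths are illustrative only.

```bqn
# Stand-in for the result of "a+" pcre.FindAll "aabaa".
fa ← ⟨1‿3⥊⟨"aa",0,2⟩, 1‿3⥊⟨"aa",3,5⟩⟩

flat ← ∾fa                     # one row per match
strings ← ⊏˘flat               # matched text of each match
starts ← 1⊏˘flat               # start offset of each match
lengths ← (2⊏˘flat) - starts   # end minus start, in codepoints
```

Here starts is ⟨0,3⟩ and lengths is ⟨2,2⟩. With capture groups present, joining would interleave group rows from different matches, so it is usually clearer to map over the list with ¨ instead.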
Split subject on all occurrences of pattern.

- pattern: Regex pattern string
- opts: Options string (same flags as Compile); omit to use no options
- subject: String to split

Returns a list of strings. On no match, returns a list containing the whole subject unchanged. On error, throws.
Replace all occurrences.

- pattern: Regex pattern string
- replacement: Replacement string
- opts: Options string (same flags as Compile); omit to use no options
- subject: String to replace in

Returns the result string. On no match, returns the subject unchanged. On error, throws.
Passing a non-string value (number, nested array, etc.) as any string
argument (pattern, options, subject, or replacement) throws
"PCRE2: expected a string argument".

- src/bqnpcre.c: C wrapper around PCRE2
- src/bqnpcre.h: Header file
- bqn/pcre2.bqn: BQN FFI module
- Makefile: Build system
- test/test.bqn: Test suite

- BQN: https://mlochbaum.github.io/BQN
  The array programming language. Language specification and documentation.
- CBQN: https://github.com/dzaima/CBQN
  The C implementation of BQN. bqnpcre requires CBQN specifically for its •FFI system value, which allows BQN programs to call functions in shared libraries.
- PCRE2: https://github.com/PCRE2Project/pcre2
  Perl Compatible Regular Expressions version 2. bqnpcre wraps the 8-bit PCRE2 library (libpcre2-8).