feat(xml/unstable): add XML parsing and serialization module #6942

tomas-zijdemans · 2026-01-07T22:09:13Z

New XML parsing and serialization module

What @std/xml has:

Streaming parser, DOM-style parser, serialization
Browser compatible, position tracking, spec-compliant

What @std/xml doesn't have:

Namespace resolution, DTD/Schema validation, HTML entities
Custom entities, XPath/selectors, object-to-XML builder

Benchmark Results

Performance work never really ends, and you often find yourself comparing apples and oranges. Anyway. Here goes.

The challengers

Library	XML Spec compliant?	Streaming XML parsing?	Error position tracking?
SAX	No	Yes	Yes
saxes	Yes	Yes	Yes
fast-xml-parser	No	No	Yes
txml	No	Yes	No
xml2js	No	No	Partial
htmlparser2	No	Yes	Partial
deno std	Yes	Yes	Yes (configurable)

Error position tracking is nice for debugging, but it really hurts performance. So I made it an option that defaults to true for non-streaming and false for streaming (streaming is usually used for trusted data sources, such as multi-GB feeds or logs, where throughput is critical). The results below include runs both with and without error position tracking.
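To illustrate where the cost of position tracking comes from, here is a minimal sketch of a scanner with an optional line/column counter. It is not the code from this PR: `scanTags`, `ScanOptions`, `TagHit`, and `trackPosition` are hypothetical names chosen for the example. When the option is off, the per-character bookkeeping is skipped entirely.

```typescript
// Hypothetical sketch, not the @std/xml implementation.
interface ScanOptions {
  /** When true, each reported tag carries a 1-based line and column. */
  trackPosition?: boolean;
}

interface TagHit {
  name: string;
  line?: number;
  column?: number;
}

function isNameEnd(code: number): boolean {
  // space, tab, newline, carriage return, '/', '>'
  return code === 32 || code === 9 || code === 10 || code === 13 ||
    code === 47 || code === 62;
}

/** Finds start tags like <item> and optionally records where they begin. */
function scanTags(xml: string, options: ScanOptions = {}): TagHit[] {
  const track = options.trackPosition ?? false;
  const hits: TagHit[] = [];
  let line = 1;
  let column = 1;

  for (let i = 0; i < xml.length; i++) {
    const code = xml.charCodeAt(i);

    if (code === 60 /* < */) {
      const next = xml.charCodeAt(i + 1);
      // Skip end tags, comments, and processing instructions in this sketch.
      if (next !== 47 /* / */ && next !== 33 /* ! */ && next !== 63 /* ? */) {
        let end = i + 1;
        while (end < xml.length && !isNameEnd(xml.charCodeAt(end))) end++;
        if (end > i + 1) {
          const hit: TagHit = { name: xml.slice(i + 1, end) };
          if (track) {
            hit.line = line;
            hit.column = column;
          }
          hits.push(hit);
        }
      }
    }

    // This bookkeeping runs for every character; skipping it is the saving.
    if (track) {
      if (code === 10 /* \n */) {
        line++;
        column = 1;
      } else {
        column++;
      }
    }
  }
  return hits;
}
```

Calling `scanTags(xml, { trackPosition: true })` fills in `line` and `column` on each hit, while the default call leaves them undefined.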

Test data

For the non-streaming benchmarks I used the test files located in testdata. For the streaming benchmark I used a single 597MB file (Google product data), which is not checked into testdata. Other payloads may give different results.

Small Files (<10KB) — Median Results

Parser	Time (ms)	vs Deno std
txml	0.010	1.6x faster
Deno std (no pos)	0.010	1.6x faster
Deno std (+pos)	0.016	baseline
saxes	0.016	1.0x (same)
htmlparser2	0.021	1.3x slower
SAX	0.028	1.8x slower
fast-xml-parser	0.037	2.3x slower
xml2js	0.047	2.9x slower

1 Large File (301KB) — Median Results

Parser	Time (ms)	vs Deno std
txml	2.10	1.5x faster
saxes	2.40	1.3x faster
Deno std (no pos)	2.47	1.3x faster
Deno std (+pos)	3.11	baseline
htmlparser2	4.59	1.5x slower
SAX	7.94	2.6x slower
fast-xml-parser	11.60	3.7x slower
xml2js	14.67	4.7x slower

Streaming (a 597MB file) — Median Results

Parser	Time (s)	Throughput	vs Deno std
Deno std (no pos)	4.24	179K items/s	baseline
Deno std (+pos)	4.47	170K items/s	1.1x slower
saxes	5.24	145K items/s	1.2x slower
htmlparser2	6.44	118K items/s	1.5x slower
SAX	16.45	46K items/s	3.9x slower
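
The "vs Deno std" column in the tables above is the ratio of each median time to the baseline row's median. Note that the non-streaming tables use Deno std (+pos) as the baseline, while the streaming table uses Deno std (no pos). A minimal sketch of that arithmetic, using values copied from the tables; `relativeSpeed` is a hypothetical helper, not part of the benchmark code:

```typescript
/** Formats a median time relative to a baseline, as in the "vs Deno std" column. */
function relativeSpeed(time: number, baseline: number): string {
  if (time === baseline) return "1.0x (same)";
  const faster = time < baseline;
  const ratio = faster ? baseline / time : time / baseline;
  return `${ratio.toFixed(1)}x ${faster ? "faster" : "slower"}`;
}

// Large file (301KB): baseline is Deno std (+pos) at 3.11 ms.
const txmlLarge = relativeSpeed(2.10, 3.11); // "1.5x faster"
const xml2jsLarge = relativeSpeed(14.67, 3.11); // "4.7x slower"

// Streaming (597MB): baseline is Deno std (no pos) at 4.24 s.
const saxStreaming = relativeSpeed(16.45, 4.24); // "3.9x slower"
```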

crowlKats · 2026-01-08T01:48:18Z

could we get some benchmarks comparing to other parsers?

tomas-zijdemans · 2026-01-08T05:54:15Z

> could we get some benchmarks comparing to other parsers?

Yes, that's a good idea. I'll look into it.

codecov · 2026-01-08T06:11:01Z

Codecov Report

❌ Patch coverage is 96.71339% with 68 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.23%. Comparing base (6b93b78) to head (fdd09f0).
⚠️ Report is 9 commits behind head on main.

Files with missing lines	Patch %	Lines
xml/_tokenizer.ts	95.29%	54 Missing and 3 partials ⚠️
xml/_parse_sync.ts	98.31%	4 Missing and 2 partials ⚠️
xml/_entities.ts	96.07%	4 Missing ⚠️
xml/parse_stream.ts	96.77%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6942      +/-   ##
==========================================
- Coverage   94.28%   94.23%   -0.06%     
==========================================
  Files         584      610      +26     
  Lines       43186    45609    +2423     
  Branches     6933     7501     +568     
==========================================
+ Hits        40720    42981    +2261     
- Misses       2413     2568     +155     
- Partials       53       60       +7


tomas-zijdemans · 2026-01-08T20:31:42Z

> could we get some benchmarks comparing to other parsers?

I've updated the description now. Let me know if you would like me to benchmark against a specific package.


tomas-zijdemans · 2026-01-08T21:15:17Z

import_map.json

Sorry, I have no idea why this formatting is happening 😅

timreichen · 2026-01-08T23:23:32Z

Ref: denoland/deno#24995
There was no reply on whether DOMParser or something similar would be implemented in Deno, so I like this PR in general.
However, it might be worth checking with the Deno core team what their current stance is on this before merging anything.

tomas-zijdemans · 2026-01-09T09:05:16Z

> Ref: denoland/deno#24995 There was no reply on whether DOMParser or something similar would be implemented in Deno, so I like this PR in general. However, it might be worth checking with the Deno core team what their current stance is on this before merging anything.

Thanks, I was not aware of this discussion. Perhaps we could have it as an unstable module for now? Then we can always kick it out, should the core team decide to implement DOMParser

tomas-zijdemans · 2026-01-09T12:34:20Z

Updated again to increase streaming performance and get test coverage to 100%

tomas-zijdemans · 2026-01-14T22:23:48Z

More perf work. Will look into using callbacks instead of arrays of objects
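
For context on the callback idea: a tokenizer that returns an array allocates one object per token, whereas a callback-based one hands the values straight to the consumer. A rough sketch of the two shapes follows. The names (`tokenizeWithCallbacks`, `tokenizeToArray`, `TokenCallbacks`, `Token`) are hypothetical and not taken from this PR, and the tokenizer itself is deliberately simplistic.

```typescript
// Hypothetical sketch; the real tokenizer in this PR is more involved.
type Token = { type: "tag" | "text"; value: string };

interface TokenCallbacks {
  onTag(value: string): void;
  onText(value: string): void;
}

/** Callback style: no token objects are allocated. */
function tokenizeWithCallbacks(xml: string, callbacks: TokenCallbacks): void {
  let pos = 0;
  while (pos < xml.length) {
    const open = xml.indexOf("<", pos);
    if (open === -1) {
      callbacks.onText(xml.slice(pos));
      return;
    }
    if (open > pos) callbacks.onText(xml.slice(pos, open));
    const close = xml.indexOf(">", open);
    if (close === -1) throw new SyntaxError("Unterminated tag");
    callbacks.onTag(xml.slice(open + 1, close));
    pos = close + 1;
  }
}

/** Array style, built on top of the callbacks: one object per token. */
function tokenizeToArray(xml: string): Token[] {
  const tokens: Token[] = [];
  tokenizeWithCallbacks(xml, {
    onTag: (value) => {
      tokens.push({ type: "tag", value });
    },
    onText: (value) => {
      tokens.push({ type: "text", value });
    },
  });
  return tokens;
}

// A consumer that only counts tags never needs the intermediate array.
let tagCount = 0;
tokenizeWithCallbacks("<a>hi<b/></a>", {
  onTag: () => {
    tagCount++;
  },
  onText: () => {},
});
```

Layering the array API on top of the callback core keeps a single scanning loop while letting throughput-sensitive callers avoid the per-token allocations.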

tomas-zijdemans requested a review from kt3k as a code owner January 7, 2026 22:09

tomas-zijdemans force-pushed the xml branch from f99e4c1 to bda3483 Compare January 8, 2026 05:59

tomas-zijdemans force-pushed the xml branch 6 times, most recently from 7f792e6 to e20f201 Compare January 8, 2026 21:04

feat(xml): add XML module with streaming parser, DOM-style parser, and serialization

9b1fe20

tomas-zijdemans force-pushed the xml branch from e20f201 to 9b1fe20 Compare January 8, 2026 21:07


tomas-zijdemans added 2 commits January 9, 2026 12:42

perf(xml): native TransformStream for 20% faster streaming

81445b5

refactor(xml): remove deprecated async generator APIs, sync all tests

a5ed5bf

tomas-zijdemans added 10 commits January 12, 2026 11:23

perf(xml): use switch statement for named entity decoding

749193c

perf(xml): replace object lookups with switch in entity encoding

605f0d6

perf(xml): use charCodeAt for tokenizer hot path

e3678c9

perf(xml): switch DOM parser to character code comparisons

dc79433

perf(xml): add fast path for attribute value normalization

566f083

refactor(xml): remove helper functions

9fdb310

perf(xml): optimize switch

ec7ccf1

perf(xml): cache hot variables

ceb8a1a

feat(xml/unstable): add error position tracking as an aption

1233275

perf(xml): introduce basic dedicated capture methods

1578a99
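
Several of the commits above mention switch-based entity decoding and fast paths. As a rough illustration of that style, and not the code in those commits, the sketch below decodes XML's five predefined entities with a `switch` and returns early when the text contains no `&`. The names `decodeNamedEntity` and `decodeText` are hypothetical. Numeric character references are not handled, and unknown entities are simply left as-is here.

```typescript
/** Returns the character for one of XML's five predefined entities, or undefined. */
function decodeNamedEntity(name: string): string | undefined {
  switch (name) {
    case "lt":
      return "<";
    case "gt":
      return ">";
    case "amp":
      return "&";
    case "apos":
      return "'";
    case "quot":
      return '"';
    default:
      return undefined;
  }
}

/** Replaces predefined entity references in text; anything else is left untouched. */
function decodeText(text: string): string {
  // Fast path: most text contains no '&' at all.
  let amp = text.indexOf("&");
  if (amp === -1) return text;

  let out = "";
  let pos = 0;
  while (amp !== -1) {
    const semi = text.indexOf(";", amp + 1);
    if (semi === -1) break;
    const decoded = decodeNamedEntity(text.slice(amp + 1, semi));
    if (decoded !== undefined) {
      out += text.slice(pos, amp) + decoded;
      pos = semi + 1;
      amp = text.indexOf("&", pos);
    } else {
      // Not a predefined entity; keep scanning after this '&'.
      amp = text.indexOf("&", amp + 1);
    }
  }
  return out + text.slice(pos);
}
```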

tomas-zijdemans added 5 commits January 14, 2026 21:02

perf(xml): optimize CDATA capture with indexOf batch scanning

dc6da86

refactor(xml): handle comment and PI capture

23f0a7b

perf(xml): XmlName Caching when streaming

6fa4da1

perf(xml): pending Start Element Reuse

b9140d1

perf(xml): optimize name parsing, add XmlName.raw property

ae340f2
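
The "indexOf batch scanning" idea named in the CDATA commit above can be sketched as follows: instead of inspecting a CDATA section character by character, a single `indexOf` call looks for the closing `]]>`. This is an illustration under assumptions, not the PR's code, and `captureCdata` is a hypothetical name.

```typescript
const CDATA_START = "<![CDATA[";
const CDATA_END = "]]>";

/**
 * If a CDATA section starts at `start`, returns its content and the index
 * just past the closing "]]>". Returns undefined otherwise.
 */
function captureCdata(
  xml: string,
  start: number,
): { content: string; next: number } | undefined {
  if (!xml.startsWith(CDATA_START, start)) return undefined;
  const contentStart = start + CDATA_START.length;
  // One indexOf call scans the whole section instead of a per-character loop.
  const end = xml.indexOf(CDATA_END, contentStart);
  // Unterminated section (or, when streaming, not fully buffered yet).
  if (end === -1) return undefined;
  return { content: xml.slice(contentStart, end), next: end + CDATA_END.length };
}
```

A streaming caller would typically treat the `undefined` result for an unterminated section as "wait for more input" rather than as an immediate error.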

tomas-zijdemans added 6 commits January 14, 2026 23:26

fix tests

dbd8ffe

feat(xml): callback based streaming core

94a37e9

feat(xml): direct streaming

d9f917b

feat(xml): use callbacks for parse

d5b2b2a

test coverage

6266a3b

fix(xml): avoid double parseName

fdd09f0
