ocrdrift

Stop debugging OCR regressions by diffing JSON with your eyeballs.

ocrdrift is a local-first CLI and visual report generator for comparing two OCR or document-extraction runs. It highlights exactly what changed — fields, confidences, token samples, and table rows — and overlays drift directly on the source page.

Built for OCR, VLM, IDP, and document AI engineers who need to answer questions like:

  • Which fields broke after a preprocessing change?
  • Did my prompt/model upgrade improve extraction or just move errors around?
  • Which invoice/receipt/table regions became unstable?
  • Where did confidence collapse even when values still matched?

Why this exists

Document AI teams have parsers, OCR engines, benchmarks, and evaluation notebooks.

What they often do not have is a fast, developer-friendly way to debug extraction drift visually.

That gap gets painful when:

  • a layout tweak causes one field to disappear
  • a VLM “mostly works” but silently corrupts tax or totals
  • a table parser changes row alignment after a model upgrade
  • a preprocessing step improves one document family and breaks another

ocrdrift is built to be the missing inspection layer between raw extraction output and production confidence.

First-glance demo

npm install
npm run demo
open out/demo-report/index.html

That generates a side-by-side report comparing a clean invoice extraction against a noisy/rotated variant.

What you get

  • visual HTML report with side-by-side document pages
  • field drift table with changed / missing / added status
  • confidence deltas to catch “looks right but got weaker” regressions
  • table drift summary for row-level parsing changes
  • token sample diff to inspect OCR damage quickly
  • Tesseract TSV adapter so existing OCR outputs can be converted immediately
  • zero paid APIs and no external services required

Example report

(screenshot: ocrdrift visual report)

After npm run demo, open:

  • out/demo-report/index.html
  • out/demo-report/report.json

The demo intentionally shows:

  • "Industrial" → "lndustrial" (capital I misread as lowercase l)
  • "Layout Analyzer" → "Layout AnaIyzer" (lowercase l misread as capital I)
  • "26.82" → "2G.82" (digit 6 misread as G)
  • confidence drops around the due date and totals region

This is the kind of regression report engineers actually need during pipeline iteration.

Install

npm install
npm link

Then use:

ocrdrift --help

CLI

1) Run the built-in demo

ocrdrift demo --out ./out/demo-report

2) Compare two extraction JSON files

ocrdrift compare \
  --a ./examples/invoice-baseline.json \
  --b ./examples/invoice-rotated.json \
  --out ./out/my-report

3) Convert Tesseract TSV into ocrdrift JSON

ocrdrift adapt:tesseract \
  --input ./examples/sample.tesseract.tsv \
  --image ./examples/invoice-baseline.svg \
  --out ./out/sample.from-tesseract.json
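The adapter's job is mechanical: Tesseract TSV has word-level rows (level 5) with `left`/`top`/`width`/`height` pixel boxes and a 0–100 `conf` column. A minimal sketch of that conversion, assuming standard Tesseract column names (the function name here is illustrative, not the actual `src/adapters/tesseract.js` API):

```javascript
// Convert Tesseract TSV text into ocrdrift-style token objects.
// Assumes the stock Tesseract header: level, page_num, ..., left, top,
// width, height, conf, text.
function tesseractTsvToTokens(tsv) {
  const lines = tsv.trim().split("\n");
  const header = lines[0].split("\t");
  const col = (name) => header.indexOf(name);
  const tokens = [];
  for (const line of lines.slice(1)) {
    const cells = line.split("\t");
    // Only level-5 rows are words; lower levels are page/block/line containers.
    if (cells[col("level")] !== "5") continue;
    const text = cells[col("text")];
    if (!text || !text.trim()) continue;
    tokens.push({
      id: `tok-${tokens.length + 1}`,
      text,
      confidence: Number(cells[col("conf")]) / 100, // Tesseract conf is 0–100
      page: Number(cells[col("page_num")]),
      bbox: {
        x: Number(cells[col("left")]),
        y: Number(cells[col("top")]),
        width: Number(cells[col("width")]),
        height: Number(cells[col("height")]),
      },
    });
  }
  return tokens;
}
```

The same pattern applies to any engine that emits word boxes plus confidences.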

Input schema

ocrdrift uses a small open JSON format so you can adapt any OCR or extraction pipeline into it.

{
  "schemaVersion": "ocrdrift/v1",
  "documentId": "invoice-001",
  "engine": "my-ocr-pipeline",
  "imagePath": "./page-1.png",
  "pages": [{ "number": 1, "width": 1000, "height": 1400, "imagePath": "./page-1.png" }],
  "tokens": [
    {
      "id": "tok-1",
      "text": "Invoice",
      "confidence": 0.98,
      "page": 1,
      "bbox": { "x": 100, "y": 80, "width": 120, "height": 32 }
    }
  ],
  "fields": {
    "invoice_number": {
      "value": "INV-2026-0319",
      "confidence": 0.96,
      "page": 1,
      "bbox": { "x": 680, "y": 132, "width": 210, "height": 34 }
    }
  },
  "tables": [
    {
      "name": "line_items",
      "rows": [{ "description": "Vision SDK", "qty": "2", "unit_price": "149.00", "line_total": "298.00" }]
    }
  ]
}
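Adapting your own pipeline means wrapping its output into this shape. A hedged sketch, where `myPipelineResult` and its field names (`docId`, `words`, etc.) are hypothetical stand-ins for whatever your extractor emits; only the returned object follows the `ocrdrift/v1` schema above:

```javascript
// Wrap a hypothetical pipeline result into the ocrdrift/v1 document shape.
function toOcrdriftDoc(myPipelineResult) {
  return {
    schemaVersion: "ocrdrift/v1",
    documentId: myPipelineResult.docId,
    engine: "my-ocr-pipeline",
    imagePath: myPipelineResult.image,
    pages: [
      {
        number: 1,
        width: myPipelineResult.pageWidth,
        height: myPipelineResult.pageHeight,
        imagePath: myPipelineResult.image,
      },
    ],
    tokens: myPipelineResult.words.map((w, i) => ({
      id: `tok-${i + 1}`,
      text: w.text,
      confidence: w.conf,
      page: 1,
      bbox: { x: w.x, y: w.y, width: w.w, height: w.h },
    })),
    // Fields are assumed to already be { name: { value, confidence, page, bbox } }.
    fields: myPipelineResult.fields,
    tables: [],
  };
}
```

Write two such documents (run A and run B) and feed them to `ocrdrift compare`.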

Who should use this

  • OCR platform teams
  • Document parsing / IDP engineers
  • invoice and receipt extraction teams
  • VLM prompt / pipeline experimenters
  • benchmark/evaluation owners who need a visual debugging layer

Why developers might adopt it quickly

  • tiny install surface
  • no hosted backend
  • works with existing OCR output after light adaptation
  • demoable in under 2 minutes
  • obvious value on first run

Architecture

OCR / VLM / parser output A ─┐
                             ├─> ocrdrift compare ──> normalized diff model ──> HTML report + JSON summary
OCR / VLM / parser output B ─┘

Tesseract TSV ──> ocrdrift adapt:tesseract ──> ocrdrift JSON

Core modules:

  • src/compare.js — field/table/token diff logic
  • src/render.js — self-contained HTML report renderer
  • src/adapters/tesseract.js — starter adapter for Tesseract TSV
  • src/demo-data.js — built-in realistic demo fixtures
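The field diff at the heart of the compare step can be sketched as follows. This mirrors the changed / missing / added statuses and confidence deltas described above, but is an illustrative simplification, not the exact `src/compare.js` internals:

```javascript
// Diff two { name: { value, confidence } } field maps into report rows.
function diffFields(fieldsA, fieldsB) {
  const names = new Set([...Object.keys(fieldsA), ...Object.keys(fieldsB)]);
  const rows = [];
  for (const name of names) {
    const a = fieldsA[name];
    const b = fieldsB[name];
    if (a && !b) {
      rows.push({ field: name, status: "missing" });
    } else if (!a && b) {
      rows.push({ field: name, status: "added" });
    } else {
      rows.push({
        field: name,
        status: a.value === b.value ? "unchanged" : "changed",
        // Confidence delta catches "looks right but got weaker" regressions
        // even when the values still match.
        confidenceDelta: +(b.confidence - a.confidence).toFixed(3),
      });
    }
  }
  return rows;
}
```

Note the `unchanged`-but-negative-`confidenceDelta` case: that is exactly the regression class a value-only diff would miss.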

Roadmap

  • adapters for PaddleOCR, docTR, Azure Form Recognizer, Google Document AI, and generic LLM extraction JSON
  • multi-page PDF support
  • field-group scoring and rule-based severity tuning
  • CLI batch mode for regression suites
  • CI summary output for pull requests
  • overlay heatmaps for token loss / confidence collapse

Competitive angle

Most tooling in document AI is optimized for extraction.

ocrdrift is optimized for inspection, debugging, and regression review.

That makes it useful not only during model evaluation, but during day-to-day engineering work.

Repo structure

ocrdrift/
├─ bin/
├─ src/
│  ├─ adapters/
│  ├─ cli.js
│  ├─ compare.js
│  ├─ demo-data.js
│  └─ render.js
├─ examples/
├─ tests/
├─ docs/
└─ launch/

License

MIT
