docs: add repo correctness audit ledger #921

Merged
mldangelo-oai merged 3 commits into main from mdangelo/codex/repo-correctness-audit-ledger
Apr 11, 2026
@mldangelo-oai
Contributor

@mldangelo-oai mldangelo-oai commented Apr 10, 2026

Summary

  • add a repo-wide correctness audit ledger with explicit proof obligations
  • inventory every scanner and cross-cutting layer with current evidence levels
  • record current boundary-hardening findings and the next high-risk audit backlog

Validation

  • uv run ruff format modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run ruff check --fix modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run pytest -n auto -m "not slow and not integration" --maxfail=1
  • uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • git diff --check

Summary by CodeRabbit

  • Documentation
    • Added comprehensive audit documentation outlining quality standards and verification processes for the codebase.

@coderabbitai
Contributor

coderabbitai bot commented Apr 10, 2026

Walkthrough

A new documentation file establishes a repo-wide correctness audit ledger defining proof obligations across routing, parsing, security, and resource usage. It specifies evidence levels (E0–E4), audit scope coverage, scanner inventory tracking, and an iterative audit workflow with a current findings table and high-risk backlog.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Correctness Audit Ledger: `docs/agents/repo-correctness-audit.md` | New documentation defining correctness standards with proof obligations, evidence levels, audit scope coverage, scanner inventory, workflow procedures, PR ledger with findings, high-risk items, and notes log. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 Hop along, dear code, with standards so clear,
A correctness ledger now holds repo dear,
With proof obligations mapped out with care,
And audit workflows floating through the air,
We'll chase those bugs and fix them all—cheer!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'docs: add repo correctness audit ledger' directly and accurately summarizes the main change: adding a new documentation file that establishes a repo-wide correctness audit ledger with explicit proof obligations and audit tracking. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

@mldangelo-oai mldangelo-oai enabled auto-merge (squash) April 10, 2026 20:39

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/agents/repo-correctness-audit.md`:
- Around line 91-137: The docs table in docs/agents/repo-correctness-audit.md
will drift from the canonical scanner registry; update the workflow to derive
this table from the source scanner_registry_metadata.py instead of manual edits:
add a script (e.g., generate_scanner_inventory_doc) that reads SCANNER_REGISTRY
(or the module-level registry/metadata in scanner_registry_metadata.py), emits
the markdown table and a generated-at timestamp + scanner count, and wire that
script into CI (or commit its output) so docs are regenerated automatically;
update the README/table header to note it is autogenerated from
scanner_registry_metadata.py.
- Around line 154-156: Run Prettier to fix the markdown lint errors (MD013
line-length violations and MD018) in this document: execute the recommended
command to install dev deps and reformat the file (npm ci --ignore-scripts &&
npx prettier --write docs/agents/repo-correctness-audit.md), review the
resulting changes around the “Earlier open PRs from the same boundary-hardening
campaign include `#901` and `#907` through `#916`” paragraph to confirm MD018 is
resolved, and commit the formatted file.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0a228a09-7e3f-4bdb-ad26-660bd07834e2

📥 Commits

Reviewing files that changed from the base of the PR and between f285a05 and 1591c1e.

📒 Files selected for processing (1)
  • docs/agents/repo-correctness-audit.md

Comment on lines +91 to +137
| Scanner | Primary files/formats | Current evidence | Next proof target |
| --------------------- | ---------------------------------------------------------- | ---------------- | --------------------------------------------------------------- |
| `pickle` | `.pkl`, `.pickle`, `.dill`, `.bin`, `.pt`, `.pth`, `.ckpt` | E3 | post-budget and malformed opcode corpus parity |
| `picklescan_adapter` | standalone picklescan bridge | E3 | adapter/cache equivalence for inconclusive reports |
| `pytorch_zip` | ZIP-backed PyTorch checkpoints | E3 | ZIP metadata parse boundaries and nested pickle cache semantics |
| `pytorch_binary` | raw `.bin` PyTorch-like blobs | E1 | bounded binary fallback and benign weight near-matches |
| `joblib` | `.joblib`, compressed/raw pickle wrappers | E3 | codec failure semantics and cache preservation |
| `jax_checkpoint` | JAX/Orbax/checkpoint pickles | E1 | index/metadata structure failures and nested pickle routing |
| `flax_msgpack` | `.msgpack`, `.flax`, `.orbax`, `.jax` | E1 | msgpack extension types, depth, and partial unpack coverage |
| `numpy` | `.npy`, `.npz` | E3 | object-array pickle failures and `.npz` member routing |
| `safetensors` | `.safetensors` | E3 | malformed header/schema and dtype consistency |
| `keras_h5` | HDF5 Keras models | E3, PR #917 | cache and aggregate semantics after malformed config fixes |
| `keras_zip` | `.keras` ZIP models | E3, PR #918 | metadata/weights alias ambiguity after malformed config fixes |
| `tf_savedmodel` | SavedModel dirs, `.pb` | E1 | protobuf parse budgets and function library edges |
| `tf_metagraph` | `.meta` | E1 | protobuf parse budgets and attr truncation semantics |
| `tflite` | `.tflite`, routed `.bin` | E3, PR #916 | flatbuffer table bounds and custom-op recovery |
| `onnx` | `.onnx` | E3, PR #915 | external data path policy and dtype coverage |
| `coreml` | `.mlmodel` | E3 | protobuf truncation, linked model paths, custom layer strings |
| `openvino` | `.xml` IR | E3 | XML parse failures, entity/DOCTYPE boundaries, companion `.bin` |
| `gguf` | `.gguf`, `.ggml`, related | E3, PR #914 | metadata value type matrix and tensor offset checks |
| `xgboost` | `.bst`, `.model`, `.json`, `.ubj` | E1 | JSON/UBJSON malformed root, subprocess isolation |
| `lightgbm` | `.model`, `.txt`, `.lgb`, `.lightgbm` | E1 | text parser bounds and native-library indicators |
| `catboost` | `.cbm` | E3, PR #924 | binary marker bounds and metadata strings |
| `mxnet` | `*-symbol.json`, `*-NNNN.params` | E3, PR #923 | graph reference traversal and metadata payload recovery |
| `nemo` | `.nemo` tar archives | E3, PR #919 | multi-config precedence and malformed member combinations |
| `jinja2_template` | tokenizer configs, YAML, templates, GGUF metadata | E3, PR #920 | cache preservation and GGUF metadata extraction failures |
| `skops` | `.skops` ZIP archives | E3 | JSON schema variations and duplicate member precedence |
| `torchserve_mar` | `.mar` archives | E3 | manifest schema roots and handler AST edge cases |
| `oci_layer` | OCI `.manifest` | E3 | manifest schema roots, local-vs-remote layer resolution |
| `zip` | generic ZIP/NPZ/MAR fallback | E3 | unsupported member failure semantics and cleanup |
| `tar` | tar families | E3 | unsupported member failure semantics and cleanup |
| `sevenzip` | `.7z` | E3 | nested routing parity with ZIP/TAR |
| `compressed` | `.gz`, `.bz2`, `.xz`, `.lz4`, `.zlib` | E3 | wrapper extension inference and temporary cleanup |
| `manifest` | model/config manifests | E3, PR #922 | JSON/YAML/TOML malformed roots and nested scanning |
| `metadata` | model cards/docs/text | E1 | secret/security pattern false positives and truncation |
| `text` | general text docs | E0 | duplicate responsibility with metadata/manifest |
| `pmml` | `.pmml` | E3 | XML parse boundaries and extension payload recovery |
| `paddle` | `.pdmodel`, `.pdiparams` | E3, PR #925 | protobuf/op descriptor parse failures |
| `cntk` | `.dnn`, `.cmf` | E3 | split reference tracking and malformed binary handling |
| `rknn` | `.rknn` | E1 | marker and string extraction bounds |
| `torch7` | `.t7`, `.th`, `.net` | E1 | legacy serialization parse failures |
| `r_serialized` | `.rds`, `.rda`, `.rdata` | E1 | format header variants and string extraction bounds |
| `executorch` | `.ptl`, `.pte` | E1 | archive/table parse failures and nested payloads |
| `tensorrt` | `.engine`, `.plan`, `.trt` | E3 | plugin marker matrix and binary truncation |
| `llamafile` | `.llamafile`, `.exe`, extensionless | E1 | executable header routing and model payload boundaries |
| `weight_distribution` | optional secondary analysis | E0 | optional dependency isolation and non-security failure behavior |

🧹 Nitpick | 🔵 Trivial

Reduce scanner-inventory drift against registry metadata.

This table is high-value but manually curated. It will drift from modelaudit/scanner_registry_metadata.py unless you pin a source snapshot (e.g., scanner count + generated-at note, or scripted generation).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/agents/repo-correctness-audit.md` around lines 91 - 137, The docs table
in docs/agents/repo-correctness-audit.md will drift from the canonical scanner
registry; update the workflow to derive this table from the source
scanner_registry_metadata.py instead of manual edits: add a script (e.g.,
generate_scanner_inventory_doc) that reads SCANNER_REGISTRY (or the module-level
registry/metadata in scanner_registry_metadata.py), emits the markdown table and
a generated-at timestamp + scanner count, and wire that script into CI (or
commit its output) so docs are regenerated automatically; update the
README/table header to note it is autogenerated from
scanner_registry_metadata.py.
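
One way to read the suggestion above is as a small generator that renders the inventory table, a scanner count, and a generated-at note from registry data. The sketch below is illustrative only: the real structure of `scanner_registry_metadata.py` is not shown in this PR, so `EXAMPLE_REGISTRY`, its field names (`formats`, `evidence`, `next_target`), and `render_inventory` are hypothetical stand-ins rather than the project's actual API.

```python
from datetime import datetime, timezone

# Hypothetical stand-in for the real registry. The actual layout of
# modelaudit/scanner_registry_metadata.py is not shown in this PR.
EXAMPLE_REGISTRY = {
    "pickle": {
        "formats": "`.pkl`, `.pickle`",
        "evidence": "E3",
        "next_target": "post-budget and malformed opcode corpus parity",
    },
    "text": {
        "formats": "general text docs",
        "evidence": "E0",
        "next_target": "duplicate responsibility with metadata/manifest",
    },
}

HEADER = ("Scanner", "Primary files/formats", "Current evidence", "Next proof target")


def render_inventory(registry, generated_at=None):
    """Render a markdown scanner table preceded by a provenance comment."""
    # Assumes generated_at, when supplied, is already in UTC.
    generated_at = generated_at or datetime.now(timezone.utc)
    stamp = generated_at.strftime("%Y-%m-%dT%H:%M:%SZ")
    lines = [
        "<!-- Autogenerated from scanner_registry_metadata.py; "
        f"{len(registry)} scanners; generated at {stamp} -->",
        "| " + " | ".join(HEADER) + " |",
        "| " + " | ".join("---" for _ in HEADER) + " |",
    ]
    for name, meta in registry.items():
        lines.append(
            f"| `{name}` | {meta['formats']} | {meta['evidence']} | {meta['next_target']} |"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_inventory(EXAMPLE_REGISTRY))
```

Embedding the count and timestamp in a comment line keeps the provenance next to the table, so a CI job could compare the recorded count against the live registry and fail when they diverge.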

Comment on lines +154 to +156
Earlier open PRs from the same boundary-hardening campaign include #901 and
#907 through #916. All open PR entries remain provisional until CI and review
complete; treat them as evidence of audited findings, not landed behavior.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify markdown lint/format status for this file
fd -i "repo-correctness-audit.md" docs -x sed -n '150,160p' {}
npx markdownlint-cli2 "docs/agents/repo-correctness-audit.md"

Repository: promptfoo/modelaudit

Length of output: 8509


Run prettier to fix markdown formatting violations.

The file has 68 markdown linting errors. Line 155 has the MD018 error you mentioned, but the primary issues are MD013 (line-length) violations across lines 79–87, 91–136, and 142–152. Per the coding guideline, format the file using:

npm ci --ignore-scripts && npx prettier --write docs/agents/repo-correctness-audit.md

This will resolve all linting errors, including the MD018 issue on line 155.

🧰 Tools
🪛 markdownlint-cli2 (0.22.0)

[warning] 155-155: No space after hash on atx style heading

(MD018, no-missing-space-atx)
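
To illustrate why line 155 trips MD018: the paragraph wraps so that one line begins with `#907`, which looks like an ATX heading with no space after the hash. The following is a rough approximation of that check, not markdownlint's own implementation; the regex and the `find_md018_like` helper are assumptions made for illustration, and the sketch ignores fenced code blocks.

```python
import re

# Approximation of the MD018 condition: a line starting with 1-6 hashes
# followed directly by a character that is neither '#' nor whitespace.
MD018_LIKE = re.compile(r"^#{1,6}[^#\s]")


def find_md018_like(text):
    """Return 1-based numbers of lines that start with '#' glued to text."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if MD018_LIKE.match(line)
    ]


SAMPLE = (
    "Earlier open PRs from the same boundary-hardening campaign include #901 and\n"
    "#907 through #916. All open PR entries remain provisional until CI and review\n"
    "complete; treat them as evidence of audited findings, not landed behavior."
)

print(find_md018_like(SAMPLE))  # [2]
```

Only the second line of the sample is flagged, because `#901` and `#916` sit mid-line. Rewrapping the paragraph so that no line begins with a PR reference, or wrapping the references in backticks, avoids the match.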

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/agents/repo-correctness-audit.md` around lines 154 - 156, Run Prettier
to fix the markdown lint errors (MD013 line-length violations and MD018) in this
document: execute the recommended command to install dev deps and reformat the
file (npm ci --ignore-scripts && npx prettier --write
docs/agents/repo-correctness-audit.md), review the resulting changes around the
“Earlier open PRs from the same boundary-hardening campaign include `#901` and
`#907` through `#916`” paragraph to confirm MD018 is resolved, and commit the
formatted file.

@mldangelo-oai mldangelo-oai merged commit 06be0b6 into main Apr 11, 2026
6 checks passed
@mldangelo-oai mldangelo-oai deleted the mdangelo/codex/repo-correctness-audit-ledger branch April 11, 2026 05:33