fix: harden standalone pickle scanner #901

Open
mldangelo-oai wants to merge 13 commits into main from mdangelo/codex/picklescan-audit-hardening

Conversation


@mldangelo-oai mldangelo-oai commented Apr 10, 2026

Summary

This hardens the standalone modelaudit-picklescan package after the audit. The scanner previously kept policy tables, stream handling, nested pickle helpers, and result construction close together in one engine module, while some resource limits were fixed constants rather than caller-controlled options. That made the package harder to reason about and left gaps in high-risk callable coverage, suspicious string detection, nested payload surfacing, and result immutability.

Changes

  • Expands standalone dangerous-global and suspicious-string policy coverage, including higher-risk callables such as pip.main, numpy.load, shutil.rmtree, webbrowser.open, and builtins.globals.
  • Splits policy, bounded stream reading, and nested-pickle helper logic into dedicated engine modules to reduce scanner coupling.
  • Adds configurable budgets for string literal scanning, nested pickle byte limits, and nested scan depth, with ModelAudit adapter pass-through for those config values.
  • Surfaces findings from bounded nested pickle scans instead of only flagging the nested payload container, and fails closed with explicit incomplete-analysis details for oversized nested pickle prefixes.
  • Makes report details and metadata top-level mappings read-only after construction while preserving mutable to_dict() output for downstream consumers.
  • Updates standalone package docs, split-package agent notes, changelog, and tests around the hardened behavior.

Validation

  • uv run --with pytest pytest tests/test_api.py -q from packages/modelaudit-picklescan: 48 passed.
  • uv run --with pytest pytest tests/test_api.py tests/test_options.py -q from packages/modelaudit-picklescan: 69 passed.
  • uv run --with pytest pytest tests/test_report.py tests/test_api.py -q from packages/modelaudit-picklescan: 58 passed.
  • uv run --with ruff ruff check src tests from packages/modelaudit-picklescan: passed.
  • uv run --with ruff ruff format --check src tests from packages/modelaudit-picklescan: passed.
  • uv run --with mypy mypy src tests from packages/modelaudit-picklescan: passed.
  • uv run --with pytest --with pytest-xdist pytest -n auto tests --tb=short from packages/modelaudit-picklescan: 78 passed.
  • env PYTHONPATH=packages/modelaudit-picklescan/src uv run python scripts/compare_pickle_scanners.py --json: passed with no safe fixture drift and no status/verdict drift.
  • uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/: passed.
  • uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/: passed.
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/: passed.
  • uv run pytest -n auto -m "not slow and not integration" --maxfail=1: 3333 passed, 75 skipped, 16 warnings after syncing with origin/main.
  • uv build --out-dir /tmp/modelaudit-picklescan-dist packages/modelaudit-picklescan: passed.
  • uv run --with twine twine check /tmp/modelaudit-picklescan-dist/*: passed.

Notes

The parity harness still reports expected rule-code drift while the root PickleScanner compatibility fallback remains in place, but the drift did not change safe fixture verdicts/statuses.

Summary by CodeRabbit

  • New Features
    • Opt-in "standalone-primary" pickle scan mode; configurable limits for string-literal scanning, nested-pickle size, and nested scan depth; bounded nested/encoded-pickle analysis and improved compressed-file routing by magic bytes.
  • Bug Fixes
    • Fail-closed behavior when limits hit, deduplicated notices, and clearer handling of truncated/partial analyses.
  • Documentation
    • Clarified resource-control dimensions, migration behavior, and that report mappings are read-only—use to_dict() for mutable output.
  • Tests
    • Expanded coverage for truncation/fail-closed cases, immutability, compressed routing, suspicious-patterns, and migration modes.
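
The read-only report mappings with mutable to_dict() output described above can be approximated with MappingProxyType. The helper names mirror the PR description, but this is a minimal sketch, not the package's actual implementation:

```python
from types import MappingProxyType

def _immutable_mapping(value: dict) -> MappingProxyType:
    # Copy first so later mutation of the caller's dict cannot leak in,
    # then wrap in a read-only view.
    return MappingProxyType(dict(value))

def to_dict(details: MappingProxyType) -> dict:
    # Hand downstream consumers a plain, mutable copy.
    return dict(details)

details = _immutable_mapping({"module": "os", "name": "system"})
# details["module"] = "posix"  # would raise TypeError: read-only mapping
```

Consumers that need to post-process findings call to_dict(); everything else sees a view that cannot be mutated after construction.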


coderabbitai bot commented Apr 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds bounded, fail-closed literal and nested-pickle scanning to the standalone pickle engine, refactors nested/policy/stream helpers into new modules, extends ScanOptions with three resource limits, makes report mappings immutable, strengthens compressed-wrapper magic-byte routing, and updates adapter, scanner merge behavior, and tests.

Changes

Cohort / File(s) Summary
Changelog & Docs
CHANGELOG.md, docs/agents/picklescan-package-split.md, packages/modelaudit-picklescan/README.md, scripts/README.md
Updated changelog and docs to describe nested/literal scan bounds, fail-closed/truncation notices, immutable report mappings, standalone-primary migration flag, wrapper-by-magic routing, and new CLI flags.
Options
packages/modelaudit-picklescan/src/modelaudit_picklescan/options.py
Added resource limits: max_string_literal_scan_chars, max_nested_pickle_bytes, max_nested_depth with defaults and strict type/non-negative validation.
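A minimal sketch of what strict type and non-negativity validation for these options might look like; the default values and class shape here are illustrative assumptions, not the package's actual defaults:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScanOptions:
    # Defaults are placeholders for illustration only.
    max_string_literal_scan_chars: int = 4096
    max_nested_pickle_bytes: int = 1 << 20
    max_nested_depth: int = 3

    def __post_init__(self) -> None:
        for field_name in (
            "max_string_literal_scan_chars",
            "max_nested_pickle_bytes",
            "max_nested_depth",
        ):
            value = getattr(self, field_name)
            # bool is a subclass of int, so reject it explicitly.
            if not isinstance(value, int) or isinstance(value, bool):
                raise TypeError(f"{field_name} must be an int")
            if value < 0:
                raise ValueError(f"{field_name} must be non-negative")
```

Rejecting bool explicitly matters because `isinstance(True, int)` is true in Python, so a bare int check would silently accept `max_nested_depth=True`.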
New Engine Modules
packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/...
.../nested.py, .../policy.py, .../stream.py
Introduced nested-pickle decoding/detection, centralized global-severity and suspicious-string policy, and _BoundedPickleStream with bounded read/readline and wrapped read errors.
Scanner Core / Adapter
packages/modelaudit-picklescan/src/.../engine/scanner.py, modelaudit/scanners/pickle_scanner.py, modelaudit/scanners/picklescan_adapter.py
Refactored to use new modules; added nested_depth and deadline propagation; notice de-duplication; bounded literal windows; encoded/raw nested handling with truncation notices; added use_standalone_pickle_primary mode and merge flag propagate_fallback_state.
Report Immutability
packages/modelaudit-picklescan/src/modelaudit_picklescan/report.py
Frozen read-only Mapping views for Finding.details, Notice.details, ScanError.details, and PickleReport.metadata; added helpers and updated to_dict() serialization.
Compressed Scanner & Detection
modelaudit/scanners/compressed_scanner.py, modelaudit/utils/file/detection.py, modelaudit/scanner_registry_metadata.py, modelaudit/core.py
Added header_formats metadata, magic-byte-based wrapper routing for compressed codecs, concatenated-member and bounded decompression readers (zlib/bzip2/xz/lz4) with raw-trailer rejection; minor routing updates for R serialized extensions.
Tests & Scripts
packages/modelaudit-picklescan/tests/*, tests/scanners/*, tests/*, scripts/compare_pickle_scanners.py, tests/scripts/*
Extended and added tests covering suspicious-string matching, dangerous-call detection, nested-pickle surfacing and truncation, ScanOptions validation, report immutability, compressed-scanner LZ4/bzip2/xz behaviors, standalone-primary migration, and compare script root-mode/CLI flags.

Sequence Diagram(s)

sequenceDiagram
    participant Entry as Scanner Entry
    participant Stream as _BoundedPickleStream
    participant Policy as policy module
    participant Nested as nested module
    participant Engine as Standalone/Package Engine

    Entry->>Stream: open with byte_limit / deadline
    Entry->>Stream: read()/readline() for opcodes
    Stream-->>Entry: bytes or short_read / _StreamReadError

    alt STRING literal opcode
        Entry->>Entry: apply max_string_literal_scan_chars window
        Entry->>Policy: suspicious_string_matches(window)
        Policy-->>Entry: match / no match
        Entry->>Entry: emit SUSPICIOUS_STRING finding if match
        alt truncated window
            Entry->>Entry: emit literal_scan_truncated notice (analysis_incomplete)
        end
    end

    alt encoded/hex nested candidate
        Entry->>Nested: _decode_possible_encoded_pickle(value, max_nested_pickle_bytes)
        Nested-->>Entry: decoded bytes OR oversized-prefix signal
        alt decoded && looks_like_pickle
            Entry->>Engine: if nested_depth < max_nested_depth then recursive scan (nested_depth+1, deadline)
            Engine-->>Entry: nested PickleReport
            Entry->>Entry: surface nested findings/notices (remap)
        else oversized prefix
            Entry->>Entry: emit encoded_nested_payload_truncated notice (analysis_incomplete)
        end
    end

    Entry->>Entry: finalize PickleReport with immutable metadata

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐇 I nibble bytes in careful bounds,
I sniff encoded crumbs and rounds,
I flag when windows end too soon,
I hop between streams by moon,
Reports locked, I thump — a tidy tune.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning — coverage is 15.79%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Description Check ✅ — check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check ✅ — the title 'fix: harden standalone pickle scanner' clearly and concisely summarizes the primary change: hardening the standalone pickle scanner with expanded policy coverage, resource limits, and fail-closed behavior.


@mldangelo-oai mldangelo-oai marked this pull request as ready for review April 10, 2026 00:40
…scan-audit-hardening

# Conflicts:
#	CHANGELOG.md

github-actions bot commented Apr 10, 2026

Workflow run and artifacts

Performance Benchmarks

Compared 6 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 6 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 678.03ms -> 678.33ms (+0.0%).

Benchmark | Target | Size | Files | Baseline | Current | Change | Status
tests/benchmarks/test_scan_benchmarks.py::test_detect_file_format_safe_pickle | safe_model.pkl | 49.4 KiB | 1 | 161.5us | 169.9us | +5.2% | stable
tests/benchmarks/test_scan_benchmarks.py::test_validate_file_type_pytorch_zip | state_dict.pt | 1.5 MiB | 1 | 48.5us | 47.6us | -1.8% | stable
tests/benchmarks/test_scan_benchmarks.py::test_scan_safe_pickle | safe_model.pkl | 49.4 KiB | 1 | 28.07ms | 27.98ms | -0.3% | stable
tests/benchmarks/test_scan_benchmarks.py::test_scan_mixed_directory | mixed-corpus | 1.7 MiB | 54 | 136.47ms | 136.11ms | -0.3% | stable
tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_directory | duplicate-corpus | 840.0 KiB | 81 | 479.63ms | 480.33ms | +0.1% | stable
tests/benchmarks/test_scan_benchmarks.py::test_scan_pytorch_zip | state_dict.pt | 1.5 MiB | 1 | 33.65ms | 33.69ms | +0.1% | stable


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1a209603a7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 734fd9c53b



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/nested.py`:
- Around line 22-23: The hex truncate budget is wrong:
max_hex_nested_pickle_chars = max_nested_pickle_bytes * 2 only covers plain hex,
but escaped hex literals use "\xNN" (4 chars per byte), so update the logic
around max_hex_nested_pickle_chars (and the truncation that runs later) to
account for escaped form—either set max_hex_nested_pickle_chars =
max_nested_pickle_bytes * 4 so truncation on the escaped string inspects the
intended byte budget, or change the flow to unescape the string (remove "\x")
before applying a bytes-based truncation; adjust any code that slices/decodes
between lines 35-45 accordingly to ensure the byte budget applies to decoded
bytes, referencing max_hex_nested_pickle_chars and the truncation/unescape steps
in nested.py.

In `@packages/modelaudit-picklescan/src/modelaudit_picklescan/report.py`:
- Around line 39-44: The current _immutable_mapping and _mapping_to_dict
implementations only wrap the top-level dict (MappingProxyType(dict(value)))
leaving nested dicts/lists mutable; change _immutable_mapping to deep-freeze the
input before wrapping by recursively converting nested dicts to
MappingProxyType, lists/tuples to tuples, and sets to frozenset (start from a
deepcopy of value to avoid aliasing), and adjust _mapping_to_dict to return a
deep copy that does not share mutable nested objects; operate on the Mapping
passed to _immutable_mapping and _mapping_to_dict so Finding.details and
PickleReport.metadata become fully immutable even for nested contents.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d54c1056-287a-4ebc-92cd-b1ac8c85707c

📥 Commits

Reviewing files that changed from the base of the PR and between f4f661a and 734fd9c.

📒 Files selected for processing (14)
  • CHANGELOG.md
  • docs/agents/picklescan-package-split.md
  • modelaudit/scanners/picklescan_adapter.py
  • packages/modelaudit-picklescan/README.md
  • packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/nested.py
  • packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/policy.py
  • packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/scanner.py
  • packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/stream.py
  • packages/modelaudit-picklescan/src/modelaudit_picklescan/options.py
  • packages/modelaudit-picklescan/src/modelaudit_picklescan/report.py
  • packages/modelaudit-picklescan/tests/test_api.py
  • packages/modelaudit-picklescan/tests/test_options.py
  • packages/modelaudit-picklescan/tests/test_report.py
  • tests/scanners/test_picklescan_adapter.py


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bd7c6f6d99



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
CHANGELOG.md (1)

34-41: ⚠️ Potential issue | 🟡 Minor

Move this entry into the canonical [Unreleased] block.

The file now has two [Unreleased] sections. Adding new bullets here fragments the changelog and makes it ambiguous which unreleased block release tooling/readers should trust.

As per coding guidelines: CHANGELOG.md: “Add CHANGELOG entry under [Unreleased] section (Keep a Changelog format)”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CHANGELOG.md` around lines 34 - 41, The changelog contains duplicate
"[Unreleased]" sections: consolidate the new bullet lines ("avoid CoreML nested
parse...", "harden standalone pickle scanner...", "fail closed when standalone
pickle...") into the canonical [Unreleased] block by removing the extra
[Unreleased] header and appending these bullets under the existing [Unreleased]
section, ensuring there is only one "[Unreleased]" header and that all new
entries appear under that single block.
♻️ Duplicate comments (1)
packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/nested.py (1)

22-23: ⚠️ Potential issue | 🟠 Major

Apply the hex byte budget after accounting for escaped \xNN form.

This still under-scans escaped hex literals. max_nested_pickle_bytes * 2 is correct for plain hex, but here the slice happens before replace("\\x", ""), so \xNN payloads only get about half the configured decoded-byte budget. That creates a real false-negative path for escaped-hex nested pickles that are below the configured limit.

Suggested fix
-    max_hex_nested_pickle_chars = max_nested_pickle_bytes * 2
+    max_hex_nested_pickle_chars = max_nested_pickle_bytes * 4
@@
-    hex_candidate = stripped[:max_hex_nested_pickle_chars].replace("\\x", "")
+    hex_candidate = stripped[:max_hex_nested_pickle_chars].replace("\\x", "")[
+        : max_nested_pickle_bytes * 2
+    ]
@@
-        bounded_hex_candidate = hex_candidate[:max_hex_nested_pickle_chars]
+        bounded_hex_candidate = hex_candidate

Also applies to: 35-45
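
The arithmetic behind this finding is easy to verify: escaped \xNN literals spend four characters per byte, so a pre-unescape slice sized at max_nested_pickle_bytes * 2 inspects only half the intended byte budget:

```python
payload_bytes = 8
budget_plain_hex = payload_bytes * 2    # "NN" form: 2 chars per byte
budget_escaped_hex = payload_bytes * 4  # "\xNN" form: 4 chars per byte

# An 8-byte payload written as escaped hex occupies 32 characters.
escaped = "".join(f"\\x{b:02x}" for b in range(payload_bytes))
assert len(escaped) == budget_escaped_hex

# Slicing with the plain-hex budget *before* unescaping covers only 4 bytes.
covered = escaped[:budget_plain_hex].replace("\\x", "")
assert len(covered) // 2 == payload_bytes // 2
```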

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/nested.py`
around lines 22 - 23, The hex char budget is computed as max_nested_pickle_bytes
* 2 which only fits plain "NN" pairs, but the code slices the input before
replace("\\x", "") so escaped '\xNN' sequences consume 4 chars each and will be
under-scanned; change the pre-replace budget to account for escaped form (e.g.
set max_hex_nested_pickle_chars = max_nested_pickle_bytes * 4 or compute length
= max_nested_pickle_bytes*2 + count_of_escaped_sequences*2) and apply the same
fix to the other occurrence mentioned (lines 35-45) so the slice length is sized
for '\xNN' payloads before calling replace("\\x",""); reference variables:
max_base64_nested_pickle_chars, max_hex_nested_pickle_chars,
max_nested_pickle_bytes and the replace("\\x","") call.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/scanner.py`:
- Around line 866-909: The nested-analysis incomplete branch in
_surface_nested_pickle_findings currently only adds a Notice but leaves the
parent _ScanState.status as COMPLETE; update that branch so when
nested_report.status != ScanStatus.COMPLETE you also set the parent scan status
(self.status) to ScanStatus.INCONCLUSIVE (or the appropriate "inconclusive" enum
variant used by the project) to mark the parent scan as not fully completed,
preserving the existing Notice creation and include reference to
nested_report.status and nested_report.errors/notices in the details.

In `@tests/scanners/test_picklescan_adapter.py`:
- Around line 280-309: The test
test_pickle_report_to_scan_result_fails_closed_for_truncated_literal_scan_notice
should assert the operator-facing fail-closed notice text in addition to
existing success/metadata checks: after calling
pickle_report_to_scan_result(report), locate the emitted check with name
"Standalone Pickle Notice" (or inspect result.checks for check.rule_code ==
"S902" and check.details["pickle_notice_code"] == "literal_scan_truncated") and
add an assertion that its message/text equals the expected fail-closed message
string (the human-facing notice used when analysis_incomplete is True), ensuring
the test locks the exact notice content so the fail-closed message cannot
regress; apply the same pattern to the similar tests around lines 312-350.

---

Outside diff comments:
In `@CHANGELOG.md`:
- Around line 34-41: The changelog contains duplicate "[Unreleased]" sections:
consolidate the new bullet lines ("avoid CoreML nested parse...", "harden
standalone pickle scanner...", "fail closed when standalone pickle...") into the
canonical [Unreleased] block by removing the extra [Unreleased] header and
appending these bullets under the existing [Unreleased] section, ensuring there
is only one "[Unreleased]" header and that all new entries appear under that
single block.

---

Duplicate comments:
In `@packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/nested.py`:
- Around line 22-23: The hex char budget is computed as max_nested_pickle_bytes
* 2 which only fits plain "NN" pairs, but the code slices the input before
replace("\\x", "") so escaped '\xNN' sequences consume 4 chars each and will be
under-scanned; change the pre-replace budget to account for escaped form (e.g.
set max_hex_nested_pickle_chars = max_nested_pickle_bytes * 4 or compute length
= max_nested_pickle_bytes*2 + count_of_escaped_sequences*2) and apply the same
fix to the other occurrence mentioned (lines 35-45) so the slice length is sized
for '\xNN' payloads before calling replace("\\x",""); reference variables:
max_base64_nested_pickle_chars, max_hex_nested_pickle_chars,
max_nested_pickle_bytes and the replace("\\x","") call.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: fdd0de2a-12d4-4f9f-8dd3-c63e4056c330

📥 Commits

Reviewing files that changed from the base of the PR and between 734fd9c and bd7c6f6.

📒 Files selected for processing (6)
  • CHANGELOG.md
  • modelaudit/scanners/picklescan_adapter.py
  • packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/nested.py
  • packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/scanner.py
  • packages/modelaudit-picklescan/tests/test_api.py
  • tests/scanners/test_picklescan_adapter.py


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 792b3962b8



@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 08d8855f76



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/scanner.py`:
- Around line 853-864: The current _bounded_encoded_nested_windows computes
max_chars from the worst-case escaped-hex size and eagerly slices two large
windows for every long literal, which can allocate attacker-controlled memory;
instead, perform a cheap per-encoding candidate check and only create windows
sized for that encoding. In _bounded_encoded_nested_windows, keep
max_base64_chars and max_hex_chars calculations but first cheaply test the value
for likely encoding (e.g., quick membership checks: if value contains only hex
digits and escape patterns treat as hex, else if characters are within base64
alphabet treat as base64, otherwise handle as plain), then compute max_chars
from the chosen encoding (or 16 minimum) and only slice prefix/suffix using that
smaller max_chars; preserve the existing prefix==suffix dedup logic and return
types unchanged.
- Around line 866-876: The recursive call in _surface_nested_pickle_findings
currently launches scan_pickle_payload which creates a fresh _ScanState and
gives nested scans a full timeout; instead capture the remaining wall-clock
budget from the current scan state and pass it into the nested scan so recursion
consumes the same timeout budget. Modify the call in
_surface_nested_pickle_findings (and/or scan_pickle_payload invocation) to
compute remaining_timeout from the current _ScanState (e.g., via a method or
attribute on self._scan_state) and pass that remaining timeout into the nested
scan (either by forwarding a timeout_s/remaining_timeout parameter or by
constructing the nested scan state with the remaining value) so nested payloads
do not receive a full new timeout.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f2c2c7be-593b-4f5a-93e8-367e89aa9302

📥 Commits

Reviewing files that changed from the base of the PR and between bd7c6f6 and 394fcf7.

📒 Files selected for processing (8)
  • modelaudit/scanners/picklescan_adapter.py
  • packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/nested.py
  • packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/policy.py
  • packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/scanner.py
  • packages/modelaudit-picklescan/src/modelaudit_picklescan/report.py
  • packages/modelaudit-picklescan/tests/test_api.py
  • packages/modelaudit-picklescan/tests/test_report.py
  • tests/scanners/test_picklescan_adapter.py

@mldangelo-oai mldangelo-oai enabled auto-merge (squash) April 10, 2026 05:27
* fix: add standalone-primary pickle migration mode

* fix: isolate standalone pickle primary fallback state

* fix: narrow legacy pickle stdlib policy (#903)

* fix: narrow legacy pickle stdlib policy

Remove broad CRITICAL treatment for noisy stdlib modules in the root pickle scanner while preserving exact dangerous helper coverage and warning-level suspicious refs.

* fix: preserve exact stdlib helper detections

Keep logging configuration loaders and uuid subprocess getnode helpers as exact dangerous callables in both root and standalone pickle policies.

* fix: flag uuid getnode pickle calls

Keep uuid.getnode in the exact dangerous-call policy after narrowing broad uuid module severity, since it can dispatch to platform helper probes that invoke subprocess-backed paths.

Cover both root PickleScanner and standalone picklescan policy regressions.

* fix: route misnamed compressed wrappers by header (#904)

* fix: route misnamed compressed wrappers by header

Detect gzip, bzip2, xz, lz4, and zlib wrappers from magic bytes even when the filename uses a misleading extension. Preserve existing joblib and R serialized scanner precedence, and add positive/negative routing regressions.

* fix: reject compressed wrapper raw trailers (#905)

* fix: reject compressed wrapper raw trailers

Use bounded low-level bzip2 and xz decompression so valid first members cannot hide raw unscanned trailing bytes. Preserve concatenated valid-member support and add fail-closed regressions.

* fix: bound compressed decompression probes

Cap bzip2/xz per-call decompression max_length to a chunk-sized probe so a single compressed chunk cannot allocate up to the full remaining scan budget before limit checks run.

* fix: bound zlib decompression probes

Use the same chunk-sized decompression probe for zlib streams so per-call output is bounded before size and ratio limit checks run.
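
The chunk-sized probe pattern maps directly onto zlib.decompressobj's max_length parameter. A self-contained sketch of bounded decompression with raw-trailer rejection, under assumed names (not this repository's code):

```python
import zlib

def read_zlib_bounded(data: bytes, chunk_size: int, max_out: int) -> bytes:
    """Decompress at most max_out bytes, probing chunk_size bytes per call."""
    decomp = zlib.decompressobj()
    out = bytearray()
    pending = data
    while pending:
        # max_length caps the output of this single call; leftover compressed
        # input is parked in unconsumed_tail for the next iteration.
        chunk = decomp.decompress(pending, chunk_size)
        out += chunk
        if len(out) > max_out:
            raise ValueError("decompressed output exceeds limit")
        pending = decomp.unconsumed_tail
    if not decomp.eof:
        raise ValueError("truncated zlib stream")
    if decomp.unused_data:
        raise ValueError("raw trailing bytes after zlib stream")  # fail closed
    return bytes(out)
```

Because each decompress call is capped at chunk_size, the size and ratio checks run before any single call can allocate the full remaining budget, and bytes left over after the stream ends surface in unused_data rather than being silently skipped.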

* fix: harden lz4 wrapper decoding (#906)

* fix: harden lz4 wrapper decoding

Use the lz4 frame decompressor API through the bounded concatenated-stream loop so raw trailers fail closed and per-call output probes stay chunk-sized.

Add fake-frame regressions for optional lz4 coverage.

* fix: support legacy lz4 chunk decoding

Fall back to lz4.frame decompression contexts when LZ4FrameDecompressor is unavailable while keeping bounded output probes and raw-trailer rejection.

Add fallback-specific lz4 regressions for bounded reads, raw trailers, and concatenated frames.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
modelaudit/scanners/compressed_scanner.py (1)

493-555: ⚠️ Potential issue | 🟠 Major

Delete the temp file when decompression fails.

If any _read_*_stream_with_limits() call raises here, _decompress_to_tempfile() exits before returning and the caller never learns the temp_path. Because the file was created with delete=False, the partially written temp file is leaked on every corrupt/oversized input.

Suggested fix
     with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as temp_file:
         temp_path = temp_file.name
-
-        with open(path, "rb") as source:
-            if codec == "gzip":
-                try:
-                    with gzip.GzipFile(fileobj=source, mode="rb") as reader:
-                        total_out = self._copy_stream_with_limits(
-                            source=reader,
-                            destination=temp_file,
-                            max_decompressed_bytes=self.max_decompressed_bytes,
-                            max_ratio=self.max_decompression_ratio,
-                            compressed_size=compressed_size,
-                            chunk_size=self.chunk_size,
-                        )
-                except (OSError, EOFError, gzip.BadGzipFile) as exc:
-                    raise _CorruptStreamError(f"Invalid gzip stream: {exc}") from exc
-            elif codec == "bzip2":
-                total_out = self._read_bzip2_stream_with_limits(
-                    source=source,
-                    destination=temp_file,
-                    max_decompressed_bytes=self.max_decompressed_bytes,
-                    max_ratio=self.max_decompression_ratio,
-                    compressed_size=compressed_size,
-                    chunk_size=self.chunk_size,
-                )
-            elif codec == "xz":
-                total_out = self._read_xz_stream_with_limits(
-                    source=source,
-                    destination=temp_file,
-                    max_decompressed_bytes=self.max_decompressed_bytes,
-                    max_ratio=self.max_decompression_ratio,
-                    compressed_size=compressed_size,
-                    chunk_size=self.chunk_size,
-                )
-            elif codec == "lz4":
-                lz4_frame = self._get_lz4_frame_module()
-                total_out = self._read_lz4_stream_with_limits(
-                    source=source,
-                    destination=temp_file,
-                    lz4_frame=lz4_frame,
-                    max_decompressed_bytes=self.max_decompressed_bytes,
-                    max_ratio=self.max_decompression_ratio,
-                    compressed_size=compressed_size,
-                    chunk_size=self.chunk_size,
-                )
-            elif codec == "zlib":
-                total_out = self._read_zlib_stream_with_limits(
-                    source=source,
-                    destination=temp_file,
-                    max_decompressed_bytes=self.max_decompressed_bytes,
-                    max_ratio=self.max_decompression_ratio,
-                    compressed_size=compressed_size,
-                    chunk_size=self.chunk_size,
-                )
-            else:
-                raise _CorruptStreamError(f"Unsupported compression codec: {codec}")
+        try:
+            with open(path, "rb") as source:
+                if codec == "gzip":
+                    try:
+                        with gzip.GzipFile(fileobj=source, mode="rb") as reader:
+                            total_out = self._copy_stream_with_limits(
+                                source=reader,
+                                destination=temp_file,
+                                max_decompressed_bytes=self.max_decompressed_bytes,
+                                max_ratio=self.max_decompression_ratio,
+                                compressed_size=compressed_size,
+                                chunk_size=self.chunk_size,
+                            )
+                    except (OSError, EOFError, gzip.BadGzipFile) as exc:
+                        raise _CorruptStreamError(f"Invalid gzip stream: {exc}") from exc
+                elif codec == "bzip2":
+                    total_out = self._read_bzip2_stream_with_limits(...)
+                elif codec == "xz":
+                    total_out = self._read_xz_stream_with_limits(...)
+                elif codec == "lz4":
+                    total_out = self._read_lz4_stream_with_limits(...)
+                elif codec == "zlib":
+                    total_out = self._read_zlib_stream_with_limits(...)
+                else:
+                    raise _CorruptStreamError(f"Unsupported compression codec: {codec}")
+        except Exception:
+            if os.path.exists(temp_path):
+                os.unlink(temp_path)
+            raise
 
         return temp_path, total_out
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelaudit/scanners/compressed_scanner.py` around lines 493 - 555, The
temporary file created in _decompress_to_tempfile using
tempfile.NamedTemporaryFile(suffix=suffix, delete=False) can be leaked if any of
the codec handlers (_copy_stream_with_limits, _read_bzip2_stream_with_limits,
_read_xz_stream_with_limits, _read_lz4_stream_with_limits,
_read_zlib_stream_with_limits) raises an exception; wrap the decompression logic
in a try/except/finally (or try/finally) that removes/unlinks temp_path on any
exception before re-raising so the partially written file is always deleted,
ensuring temp_file.name (temp_path) is cleaned up when errors occur.
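The cleanup pattern the prompt describes can be sketched in isolation; `decompress` here stands in for the scanner's codec handlers (`_copy_stream_with_limits` and friends), and `CorruptStreamError` stands in for `_CorruptStreamError`:

```python
import os
import tempfile


class CorruptStreamError(Exception):
    """Stand-in for the scanner's _CorruptStreamError."""


def decompress_to_tempfile(decompress, suffix=".bin"):
    """Stream decompressed output to a temp file; unlink it on any failure."""
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as temp_file:
        temp_path = temp_file.name
        try:
            total_out = decompress(temp_file)
        except Exception:
            # Close and remove the partially written file before re-raising,
            # so corrupt or oversized inputs cannot leak temp files.
            temp_file.close()
            if os.path.exists(temp_path):
                os.unlink(temp_path)
            raise
    return temp_path, total_out
```

Because the file is created with `delete=False`, the `except` branch is the only thing standing between a corrupt input and a leaked temp file.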
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 349-350: The early-return that checks propagate_fallback_state and
_is_operational_pickle_issue(check) is too narrow: when propagate_fallback_state
is False you must also suppress legacy scanner-limitation findings and avoid
copying fallback metadata; update the conditional(s) that currently use
_is_operational_pickle_issue(check) (and the similar checks at the other
early-return sites) to also detect/skip checks with legacy states like
"scanner_limitation", "recursion_limit_exceeded",
"memory_limit_on_legitimate_model" (or add a helper like
_is_legacy_scanner_limitation(check) and use it alongside
_is_operational_pickle_issue), and remove or sanitize fallback keys
("parsing_failed", "failure_reason", "scanner_limitation") from the merged
metadata before returning so standalone-primary mode does not leak legacy
incomplete-state noise.
- Around line 4232-4238: The legacy-pass exceptions (RecursionError and those
matching _is_pickle_parse_failure) are being re-raised before the
use_standalone_pickle_primary branch, so when standalone produced a usable
package_result we still fail; change the control flow in the scanner method so
exceptions from the legacy/fallback parse are caught and not re-raised if
self.use_standalone_pickle_primary is True and the standalone package_result is
valid—specifically, wrap the legacy parse/merge block to catch RecursionError
and exceptions where _is_pickle_parse_failure(error) is True, defer or suppress
re-raising, and let the existing standalone branch that calls
_merge_missing_pickle_checks(package_result, legacy_result, ...) run and return
package_result (setting
package_result.metadata["pickle_primary_engine"]="standalone") when appropriate.
Ensure you reference the same symbols: use_standalone_pickle_primary,
package_result, legacy_result, _merge_missing_pickle_checks, RecursionError, and
_is_pickle_parse_failure.

In `@packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/policy.py`:
- Around line 24-167: The new top-level policy constants lack type annotations;
add explicit types (and necessary typing imports) so mypy can validate this
security surface: annotate _BUILTIN_MODULES and _BUILTIN_DANGEROUS_NAMES and
_DANGEROUS_WILDCARD_MODULES as FrozenSet[str] (or frozenset[str] if using Python
3.9+), and annotate _DANGEROUS_GLOBALS as FrozenSet[Tuple[str, str]] (or
frozenset[tuple[str, str]]), and import the corresponding typing names
(FrozenSet, Tuple) if needed; keep the existing values but add the type hints
next to each constant name.
- Around line 48-167: The policy lists in _DANGEROUS_GLOBALS and
_DANGEROUS_WILDCARD_MODULES are missing entries present in the legacy canonical
table (e.g., base64.b64decode and codecs.decode/encode), causing
global_severity() to return None for those symbols; update policy.py so its
_DANGEROUS_GLOBALS and/or _DANGEROUS_WILDCARD_MODULES are derived from or
synchronized with the canonical table used by
modelaudit/detectors/suspicious_symbols.py (or explicitly add the missing tuples
such as ("base64","b64decode"), ("codecs","decode"), ("codecs","encode") to
_DANGEROUS_GLOBALS) and ensure global_severity() behavior remains parity with
the legacy detector (run/change tests to validate standalone-primary parity).
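The annotation style the first comment asks for, and the severity-lookup parity the second one is about, can be sketched together. The entries below are an illustrative subset, not the full policy table, and `global_severity` is a toy lookup, not the real implementation:

```python
from typing import FrozenSet, Optional, Tuple

# Illustrative subset only; the real policy module carries many more entries.
_BUILTIN_MODULES: FrozenSet[str] = frozenset({"builtins", "__builtin__"})
_BUILTIN_DANGEROUS_NAMES: FrozenSet[str] = frozenset({"eval", "exec", "compile", "globals"})
_DANGEROUS_WILDCARD_MODULES: FrozenSet[str] = frozenset({"os", "subprocess", "ctypes"})
_DANGEROUS_GLOBALS: FrozenSet[Tuple[str, str]] = frozenset(
    {
        ("pip", "main"),
        ("shutil", "rmtree"),
        ("webbrowser", "open"),
        ("base64", "b64decode"),
        ("codecs", "decode"),
    }
)


def global_severity(module: str, name: str) -> Optional[str]:
    """Toy severity lookup; returns None for symbols missing from the tables."""
    if (module, name) in _DANGEROUS_GLOBALS:
        return "CRITICAL"
    if module in _DANGEROUS_WILDCARD_MODULES:
        return "CRITICAL"
    if module in _BUILTIN_MODULES and name in _BUILTIN_DANGEROUS_NAMES:
        return "CRITICAL"
    return None
```

The parity bug described above is exactly the last branch: any `(module, name)` missing from these frozensets silently returns `None`, which is why the tables must be synchronized with the legacy canonical table rather than maintained by hand.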

In `@packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/scanner.py`:
- Around line 588-603: The oversized-raw branch in _scan_raw_nested_pickle_bytes
currently records a nested finding but does not mark the overall scan/report as
incomplete or emit the user-facing notice; update that branch to mirror the
encoded-path behavior by (1) marking the parent scan/report as incomplete using
the same helper the encoded path uses (e.g., call the scan-incomplete helper the
codebase uses) and (2) emit the same notice/message about analysis being
incomplete so callers see the skipped deep-scan; keep the existing
_add_nested_payload_finding call (with analysis_incomplete=True) and ensure any
surface logic in _surface_nested_pickle_findings is not invoked when you
early-return.

In `@tests/scanners/test_compressed_scanner.py`:
- Around line 512-530: The test
test_compressed_scanner_rejects_raw_trailer_after_bzip2_or_xz_member currently
asserts result.success is False and no CRITICAL issues but misses the repo
contract around warning-only failures; update this test (and the other similar
tests around the 580-623 range) to also assert result.has_warnings is True and
result.has_errors is False so that CompressedScanner scan results preserve
WARNING vs CRITICAL semantics (refer to the result variable,
CompressedScanner.scan, and ScanResult.has_warnings/has_errors to locate and
implement the assertions).
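The contract those assertions encode can be sketched with a stand-in result object; the real tests would assert against the repo's `ScanResult`, whose flag names are assumed from the comment:

```python
from dataclasses import dataclass


@dataclass
class FakeScanResult:
    """Stand-in for the repo's ScanResult; only the flags under test."""
    success: bool
    has_warnings: bool
    has_errors: bool


def assert_warning_only_failure(result) -> None:
    """A rejected raw trailer should fail the scan as a WARNING, never an error."""
    assert result.success is False
    assert result.has_warnings is True
    assert result.has_errors is False
```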

In `@tests/scanners/test_pickle_scanner.py`:
- Around line 1021-1029: The test should additionally assert that the merged
result remains warning-only and that any post-merge scan outcome reasons are
removed: after calling _merge_missing_pickle_checks(target, fallback,
propagate_fallback_state=False) add assertions that target.has_warnings is True,
target.has_errors is False, and "scan_outcome_reasons" not in target.metadata;
keep the existing checks for the S777 issue and absence of the "Legacy Pickle
Parse Failure" check to ensure the fallback/merge preserved only the
warning-level evidence.
- Around line 5033-5109: The tests
test_benign_stdlib_neighbors_do_not_inherit_broad_module_criticality and
test_uuid_object_pickle_does_not_inherit_uuid_module_criticality currently only
assert absence of CRITICAL findings; update them to assert there are no failing
checks at all for the benign references. Concretely, in
test_benign_stdlib_neighbors_do_not_inherit_broad_module_criticality replace the
any(...) predicate that checks for check.status == CheckStatus.FAILED and
check.severity == IssueSeverity.CRITICAL with a predicate that checks only for
check.status == CheckStatus.FAILED (and full_ref in check.message) so any
WARNING or CRITICAL failure will fail the test; similarly in
test_uuid_object_pickle_does_not_inherit_uuid_module_criticality change the
any(...) that filters by severity to check only for check.status ==
CheckStatus.FAILED and "uuid.UUID" in check.message.
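The severity-agnostic predicate the comment asks for can be sketched like this; checks are modeled as plain dicts with string fields, whereas the real tests use the repo's `CheckStatus` and `IssueSeverity` enums:

```python
def has_failed_reference(checks, full_ref):
    """Flag any FAILED check that mentions full_ref, regardless of severity."""
    return any(
        check["status"] == "FAILED" and full_ref in check["message"]
        for check in checks
    )
```

Dropping the severity filter is the point: a benign stdlib neighbor that draws even a WARNING-level failure should fail the test, not slip past a CRITICAL-only filter.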

In `@tests/scripts/test_compare_pickle_scanners.py`:
- Around line 155-158: The test assumes a stable ordering by using
report["comparisons"][0], which can be flaky; update the assertion to select the
desired comparison deterministically (e.g., locate the comparison whose "root"
has the expected path or sort report["comparisons"] by a stable key such as
"path" before indexing). In the test_compare_pickle_scanners.py test, replace
the use of first_comparison = report["comparisons"][0] with logic that finds the
comparison where comparison["root"]["path"] (or another invariant) matches the
expected root, then assert on that comparison's "root_delta", "root"]["engine"],
and "metadata"]["pickle_primary_engine"] to avoid order coupling.
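A deterministic selection helper, sketched against the report shape described in the comment (the field names are assumptions taken from it, not the script's verified schema):

```python
def find_comparison(report, expected_path):
    """Pick a comparison by its root path instead of relying on list order."""
    matches = [
        comparison
        for comparison in report["comparisons"]
        if comparison["root"]["path"] == expected_path
    ]
    assert len(matches) == 1, f"expected exactly one comparison for {expected_path!r}"
    return matches[0]
```

Asserting on exactly one match also catches the case where the report unexpectedly contains duplicate roots.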

---

Outside diff comments:
In `@modelaudit/scanners/compressed_scanner.py`:
- Around line 493-555: The temporary file created in _decompress_to_tempfile
using tempfile.NamedTemporaryFile(suffix=suffix, delete=False) can be leaked if
any of the codec handlers (_copy_stream_with_limits,
_read_bzip2_stream_with_limits, _read_xz_stream_with_limits,
_read_lz4_stream_with_limits, _read_zlib_stream_with_limits) raises an
exception; wrap the decompression logic in a try/except/finally (or try/finally)
that removes/unlinks temp_path on any exception before re-raising so the
partially written file is always deleted, ensuring temp_file.name (temp_path) is
cleaned up when errors occur.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e0e96e2b-1628-44b0-b787-d45ab30a62bf

📥 Commits

Reviewing files that changed from the base of the PR and between 394fcf7 and 630bdcb.

📒 Files selected for processing (17)
  • CHANGELOG.md
  • docs/agents/picklescan-package-split.md
  • modelaudit/core.py
  • modelaudit/detectors/suspicious_symbols.py
  • modelaudit/scanner_registry_metadata.py
  • modelaudit/scanners/compressed_scanner.py
  • modelaudit/scanners/pickle_scanner.py
  • modelaudit/utils/file/detection.py
  • packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/policy.py
  • packages/modelaudit-picklescan/src/modelaudit_picklescan/engine/scanner.py
  • packages/modelaudit-picklescan/tests/test_api.py
  • scripts/README.md
  • scripts/compare_pickle_scanners.py
  • tests/scanners/test_compressed_scanner.py
  • tests/scanners/test_pickle_scanner.py
  • tests/scripts/test_compare_pickle_scanners.py
  • tests/test_core.py


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CHANGELOG.md`:
- Around line 217-368: The Unreleased compare target in the CHANGELOG footer is
still pointing to v0.2.28...HEAD after adding releases through 0.2.33; update
the footer link (the `[Unreleased]` compare target) to use v0.2.33...HEAD (or
equivalent latest tag) so the Unreleased diff excludes already-released commits
and restores Keep-a-Changelog navigation semantics.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 769270b2-b0f3-4102-8807-b52212ac428c

📥 Commits

Reviewing files that changed from the base of the PR and between 630bdcb and db85fec.

📒 Files selected for processing (1)
  • CHANGELOG.md


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: db85fec9b6

