fix: mark incomplete MXNet scans inconclusive by mldangelo-oai · Pull Request #923 · promptfoo/modelaudit

mldangelo-oai · 2026-04-10T20:54:36Z

Summary

This PR fixes an aggregate false-clean result in the MXNet scanner. Corrupt or truncated MXNet artifacts could make the direct scanner result report success=false while the aggregate scan still reported success and exit code 0, because the failure was INFO-level and lacked explicit scan_outcome=inconclusive metadata.

Root Cause

The MXNet scanner returned analysis_complete=false for unreadable, malformed, empty, invalid, or truncated artifacts, but it did not attach scan outcome metadata. ModelAudit exit-code aggregation intentionally treats INFO findings as non-security findings, so incomplete MXNet scans without WARNING/CRITICAL findings could be collapsed into a clean aggregate result.
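
The collapse described above can be illustrated with a minimal sketch. This is a simplified model, not ModelAudit's actual aggregation code: the `Finding` and `AggregateResult` classes and the `sketch_exit_code` function are hypothetical, and the exit-code mapping (1 for security findings, 2 for inconclusive, 0 for clean) is inferred from this PR's description.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins; ModelAudit's real types and aggregation logic may differ.
SECURITY_SEVERITIES = {"WARNING", "CRITICAL"}


@dataclass
class Finding:
    severity: str
    message: str


@dataclass
class AggregateResult:
    findings: list[Finding] = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


def sketch_exit_code(result: AggregateResult) -> int:
    """Simplified model: 1 = security findings, 2 = inconclusive, 0 = clean."""
    if any(f.severity in SECURITY_SEVERITIES for f in result.findings):
        return 1
    if result.metadata.get("scan_outcome") == "inconclusive":
        return 2
    # INFO-level findings are not security findings, so they fall through to clean.
    return 0


# Before the fix: a truncated artifact yields only an INFO finding and no outcome metadata.
before = AggregateResult(findings=[Finding("INFO", "params file truncated")])
# After the fix: the same finding is accompanied by explicit inconclusive metadata.
after = AggregateResult(
    findings=[Finding("INFO", "params file truncated")],
    metadata={"scan_outcome": "inconclusive"},
)
print(sketch_exit_code(before), sketch_exit_code(after))
```

In this model, the only difference between the two results is the metadata, which is why attaching it is enough to stop the aggregate from reporting clean.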

Changes

Add an MXNet inconclusive-result helper that records scan_outcome, scan_outcome_reasons, and analysis_incomplete metadata.
Mark unsupported extensions, symbol read/parse/structure failures, symbol truncation, params read failures, empty params, and truncated params as inconclusive.
Preserve security-finding precedence: recovered WARNING/CRITICAL findings still drive security exit code 1.
Add aggregate regressions for empty and truncated params returning exit code 2.
Add direct malformed-symbol regression coverage.
Update the Unreleased changelog.
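
As a rough illustration of the first item, a helper that records the three metadata keys might look like the sketch below. The function name `mark_inconclusive`, the constant, and the choice to de-duplicate reasons are assumptions for illustration; the scanner's real helper may differ. The reason strings are the ones named in the review of this PR.

```python
from typing import Any

# Assumed value, based on scan_outcome=inconclusive in the summary above.
INCONCLUSIVE = "inconclusive"


def mark_inconclusive(metadata: dict[str, Any], reason: str) -> None:
    """Record that analysis did not complete, keeping reasons ordered and unique."""
    metadata["scan_outcome"] = INCONCLUSIVE
    metadata["analysis_incomplete"] = True
    reasons = metadata.setdefault("scan_outcome_reasons", [])
    if reason not in reasons:
        reasons.append(reason)


metadata: dict[str, Any] = {}
mark_inconclusive(metadata, "mxnet_params_truncated")
mark_inconclusive(metadata, "mxnet_params_truncated")  # a repeated reason is recorded once
mark_inconclusive(metadata, "mxnet_symbol_parse_failed")
print(metadata)
```

Keeping the reasons in a list rather than a single string lets one scan report several independent failure paths.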

Validation

uv run pytest tests/scanners/test_mxnet_scanner.py: 13 passed
uv run ruff format modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run ruff check --fix modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/: 432 source files clean
uv run pytest -n auto -m "not slow and not integration" --maxfail=1: 2433 passed, 1100 skipped, 16 warnings
uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
git diff --check

Summary by CodeRabbit

Release Notes

Bug Fixes
- MXNet artifact scans are now properly marked as inconclusive when analysis cannot be completed, with detailed reason tracking for empty files, truncation, parse failures, and unsupported extensions.
Tests
- Expanded test coverage for inconclusive MXNet scan outcomes and validated exit code reporting for incomplete analyses.

coderabbitai · 2026-04-10T20:54:53Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4cbd2051-53a6-4dcd-9aa1-e355b729099d

📥 Commits

Reviewing files that changed from the base of the PR and between cf08a09 and 402d4e1.

📒 Files selected for processing (2)

modelaudit/scanners/mxnet_scanner.py
tests/scanners/test_mxnet_scanner.py

Walkthrough

The changes implement explicit handling of inconclusive MXNet artifact scans. When artifact reads or parsing fail (truncation, corruption, JSON parse errors, etc.), scans are marked as inconclusive with tracked reasons. Results fail only if analysis is inconclusive and no security findings were recovered.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Changelog `CHANGELOG.md` | Added bug fix entry documenting that incomplete MXNet artifact scans are now marked as inconclusive. |
| MXNet Scanner Implementation `modelaudit/scanners/mxnet_scanner.py` | Introduced `INCONCLUSIVE_SCAN_OUTCOME` handling with metadata tracking (`scan_outcome`, `analysis_incomplete`, `scan_outcome_reasons`). Added helper functions `_scan_result_has_security_findings()` and `_finish_mxnet_result()` to distinguish between failures with recovered findings vs. truly inconclusive outcomes. Multiple read/parse failure paths now mark scans as inconclusive instead of failing directly. |
| Test Coverage `tests/scanners/test_mxnet_scanner.py` | Expanded test assertions to validate inconclusive outcomes for empty/corrupt params and malformed symbol JSON. Added aggregate directory/file scan tests verifying inconclusive metadata propagation and exit code computation. Updated existing corrupt-params test with metadata assertions. |

Sequence Diagram(s)

sequenceDiagram
    participant Scanner as MXNetScanner
    participant Result as ScanResult
    participant Metadata as metadata dict
    
    Scanner->>Scanner: scan artifact
    alt Artifact Read/Parse Succeeds
        Scanner->>Result: process findings
        Scanner->>Result: finish with success status
    else Artifact Read Fails (truncated/empty/invalid)
        Scanner->>Scanner: _mark_inconclusive_scan_result()
        Scanner->>Metadata: set analysis_incomplete = True
        Scanner->>Metadata: append reason to scan_outcome_reasons
        Scanner->>Scanner: _scan_result_has_security_findings()
        alt Found Security Findings
            Scanner->>Result: finish as success
        else No Security Findings
            Scanner->>Result: finish as failure
        end
    end
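
The branch at the end of the diagram can be sketched in a few lines. This is a minimal model of the precedence rule stated in the walkthrough, not the scanner's code: `SketchResult`, `has_security_findings`, and `finish` are hypothetical names, loosely mirroring `_scan_result_has_security_findings()` and `_finish_mxnet_result()`, whose real signatures are not shown in this PR page.

```python
from dataclasses import dataclass, field

SECURITY_SEVERITIES = {"WARNING", "CRITICAL"}


@dataclass
class SketchResult:
    """Hypothetical result object holding only what the rule needs."""
    severities: list[str] = field(default_factory=list)
    metadata: dict = field(default_factory=dict)
    success: bool = True


def has_security_findings(result: SketchResult) -> bool:
    return any(s in SECURITY_SEVERITIES for s in result.severities)


def finish(result: SketchResult) -> SketchResult:
    # Fail only when analysis is incomplete AND nothing security-relevant was recovered.
    incomplete = bool(result.metadata.get("analysis_incomplete", False))
    result.success = not (incomplete and not has_security_findings(result))
    return result


truncated_clean = finish(SketchResult(["INFO"], {"analysis_incomplete": True}))
truncated_with_hit = finish(SketchResult(["CRITICAL"], {"analysis_incomplete": True}))
complete = finish(SketchResult(["INFO"], {}))
print(truncated_clean.success, truncated_with_hit.success, complete.success)
```

Under this rule a truncated artifact that still yielded a CRITICAL finding finishes as a completed scan, so the security finding rather than the truncation determines the exit code.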

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 27.27% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix: mark incomplete MXNet scans inconclusive' directly and accurately summarizes the main change: marking incomplete MXNet scans as inconclusive rather than passing/failing. |


github-actions · 2026-04-10T20:55:09Z

Workflow run and artifacts

Performance Benchmarks

Compared 6 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 1 improved, 5 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 617.11ms -> 614.73ms (-0.4%).

Top improvements:

tests/benchmarks/test_scan_benchmarks.py::test_validate_file_type_pytorch_zip -39.6% (56.9us -> 34.4us, state_dict.pt, size=1.5 MiB, files=1)

| Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `tests/benchmarks/test_scan_benchmarks.py::test_validate_file_type_pytorch_zip` | `state_dict.pt` | 1.5 MiB | 1 | 56.9us | 34.4us | -39.6% | improved |
| `tests/benchmarks/test_scan_benchmarks.py::test_detect_file_format_safe_pickle` | `safe_model.pkl` | 49.4 KiB | 1 | 118.2us | 120.2us | +1.6% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_scan_pytorch_zip` | `state_dict.pt` | 1.5 MiB | 1 | 31.22ms | 31.59ms | +1.2% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_scan_safe_pickle` | `safe_model.pkl` | 49.4 KiB | 1 | 25.55ms | 25.74ms | +0.8% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_directory` | `duplicate-corpus` | 840.0 KiB | 81 | 438.21ms | 435.30ms | -0.7% | stable |
| `tests/benchmarks/test_scan_benchmarks.py::test_scan_mixed_directory` | `mixed-corpus` | 1.7 MiB | 54 | 121.96ms | 121.94ms | -0.0% | stable |
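
For readers checking the numbers, the Change and Status columns can be approximated with the sketch below. The function names are hypothetical, and the symmetric rule (a drop of at least the threshold counts as improved) is an assumption; the report only states a 15% regression threshold. Because the table shows rounded timings, recomputed per-row percentages can differ slightly from the reported ones.

```python
def percent_change(baseline: float, current: float) -> float:
    """Relative change from baseline to current, in percent."""
    return (current - baseline) / baseline * 100.0


def classify(change_pct: float, threshold_pct: float = 15.0) -> str:
    # Assumed rule: slower by >= threshold is a regression,
    # faster by >= threshold is an improvement, anything else is stable.
    if change_pct >= threshold_pct:
        return "regression"
    if change_pct <= -threshold_pct:
        return "improved"
    return "stable"


# Aggregate median from the report: 617.11ms -> 614.73ms.
aggregate = percent_change(617.11, 614.73)
print(round(aggregate, 1), classify(aggregate))

# Largest per-benchmark movement: 56.9us -> 34.4us.
print(classify(percent_change(56.9, 34.4)))
```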

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/scanners/test_mxnet_scanner.py (1)
1-228: 🧹 Nitpick | 🔵 Trivial

Consider adding coverage for remaining inconclusive reasons.

The new tests cover mxnet_params_empty, mxnet_params_truncated, and mxnet_symbol_parse_failed. For comprehensive regression coverage, consider adding tests for the other inconclusive paths in a follow-up:

- mxnet_unsupported_extension
- mxnet_symbol_read_failed (OSError on read)
- mxnet_symbol_truncated (bounded-read truncation)
- mxnet_symbol_empty
- mxnet_symbol_invalid_structure
- mxnet_params_read_failed (OSError on read)

Would you like me to open an issue to track adding these additional test cases?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/scanners/test_mxnet_scanner.py` around lines 1 - 228, Add unit tests
exercising the remaining inconclusive scan reasons by creating small test
functions that call MXNetScanner.scan and scan_model_directory_or_file similar
to existing tests: (1) mxnet_unsupported_extension — create a file with a
non-matching extension and assert metadata["scan_outcome_reasons"] contains
"mxnet_unsupported_extension"; (2) mxnet_symbol_read_failed and
mxnet_params_read_failed — simulate OSError on read by monkeypatching
Path.read_text/Path.read_bytes (or open) to raise OSError and assert the
appropriate reason is set; (3) mxnet_symbol_truncated — write a partially
truncated JSON (bounded-read style) and assert "mxnet_symbol_truncated"; (4)
mxnet_symbol_empty — write an empty symbol file and assert "mxnet_symbol_empty";
(5) mxnet_symbol_invalid_structure — write valid JSON that lacks required keys
(e.g., missing "nodes"/"heads") and assert "mxnet_symbol_invalid_structure". Use
the same helper functions (_write_symbol_file/_write_params_file),
MXNetScanner.scan, scan_model_directory_or_file, and determine_exit_code to
validate outcomes and exit codes where applicable.
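
One way to simulate the OSError paths mentioned above is sketched here. To stay self-contained it tests a toy function, `toy_scan_params`, which is a hypothetical stand-in and not the `MXNetScanner` API; a real test would call the scanner instead and could use pytest's `monkeypatch` fixture in place of `unittest.mock.patch`. Whether the scanner actually reads through `Path.read_bytes` is also an assumption.

```python
from pathlib import Path
from unittest.mock import patch


def toy_scan_params(path: Path) -> dict:
    """Toy stand-in for a scanner; returns only outcome metadata."""
    metadata: dict = {"scan_outcome_reasons": []}
    try:
        data = path.read_bytes()
    except OSError:
        metadata["scan_outcome"] = "inconclusive"
        metadata["scan_outcome_reasons"].append("mxnet_params_read_failed")
        return metadata
    if not data:
        metadata["scan_outcome"] = "inconclusive"
        metadata["scan_outcome_reasons"].append("mxnet_params_empty")
    return metadata


def test_read_failure_is_inconclusive() -> None:
    # Replace Path.read_bytes so every call raises, simulating an I/O error.
    with patch.object(Path, "read_bytes", side_effect=OSError("simulated I/O error")):
        metadata = toy_scan_params(Path("model-0000.params"))
    assert metadata["scan_outcome"] == "inconclusive"
    assert "mxnet_params_read_failed" in metadata["scan_outcome_reasons"]


test_read_failure_is_inconclusive()
```

Patching the read call keeps the test independent of file permissions, which behave differently across operating systems and when tests run as root.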

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 035369bb-6f8a-477d-8078-5160ca90e370

📥 Commits

Reviewing files that changed from the base of the PR and between f285a05 and cf08a09.

📒 Files selected for processing (3)

CHANGELOG.md
modelaudit/scanners/mxnet_scanner.py
tests/scanners/test_mxnet_scanner.py

tests/scanners/test_mxnet_scanner.py


+import pytest
+
+import modelaudit.scanners.mxnet_scanner as mxnet_scanner


fix: mark incomplete mxnet scans inconclusive

cf08a09

mldangelo-oai enabled auto-merge (squash) April 10, 2026 20:54

coderabbitai bot reviewed Apr 10, 2026


test: cover mxnet inconclusive outcomes

402d4e1

github-code-quality bot found potential problems Apr 10, 2026


tests/scanners/test_mxnet_scanner.py

import pytest

import modelaudit.scanners.mxnet_scanner as mxnet_scanner

mldangelo-oai wants to merge 2 commits into main from mdangelo/codex/mxnet-outcome-audit
