fix: mark partial archive scans inconclusive #907
Walkthrough
This PR introduces a mechanism to mark archive scans as inconclusive when analysis cannot be fully completed. Changes include a new utility function to update scan metadata, integration across the ZIP/TAR/7z scanners to detect and flag incomplete traversals, a configurable nested scan callback for dispatch flexibility, and test coverage validating that the inconclusive metadata propagates.
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 3 passed
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/scanners/sevenzip_scanner.py`:
- Around line 475-477: The current logic treats any scan_complete=False (driven
by _scan_extracted_file) as a partial traversal and calls
mark_archive_scan_incomplete even when the outer scan failed because of genuine
nested findings; change the condition so mark_archive_scan_incomplete is only
called for true partial/aborted traversals (e.g., budget.should_stop() or
traversal failure) and not when the failure is due to nested errors: update the
if guarding mark_archive_scan_incomplete to also check result.has_errors (or use
a traversal_incomplete flag derived from scan_complete and _scan_extracted_file)
so you only mark incomplete when there are no result.has_errors, then leave
result.finish(...) as-is.
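A minimal sketch of the guard described in the comment above, using stand-ins rather than the real modelaudit classes. `FakeResult`, `finish_sevenzip_scan`, the `"sevenzip_analysis_incomplete"` reason string, and the `incomplete_reason` metadata key are all assumptions for illustration; the body of `mark_archive_scan_incomplete` here is a placeholder, and the arguments passed to `finish` are a guess because the review elides them.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class FakeResult:
    """Stand-in for ScanResult; the real class has more fields."""

    has_errors: bool = False
    success: bool = True
    metadata: dict[str, Any] = field(default_factory=dict)

    def finish(self, success: bool) -> None:
        self.success = success


def mark_archive_scan_incomplete(result: FakeResult, reason: str) -> None:
    # Placeholder body: the real helper lives in _archive_outcomes.py.
    result.metadata["analysis_incomplete"] = True
    result.metadata["incomplete_reason"] = reason  # hypothetical key


def finish_sevenzip_scan(result: FakeResult, scan_complete: bool) -> None:
    # Flag a partial traversal only when no critical finding explains
    # why scan_complete is False.
    if not scan_complete and not result.has_errors:
        mark_archive_scan_incomplete(result, "sevenzip_analysis_incomplete")
    # Assumed call shape; the review only says to leave finish(...) as-is.
    result.finish(success=scan_complete and not result.has_errors)
```

With this shape, an archive that was fully traversed but contains a critical nested finding finishes as a failure without being relabeled as incomplete.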
In `@modelaudit/scanners/zip_scanner.py`:
- Around line 471-473: The current logic marks the archive inconclusive whenever
scan_complete is false, but scan_complete is also set false when any nested
member returns success=False (including critical findings) — causing fully
traversed ZIPs with real errors to be labeled INCONCLUSIVE. Change the condition
so mark_archive_scan_incomplete(result, "zip_analysis_incomplete") is only
called when traversal was actually incomplete (e.g., an explicit
analysis_incomplete flag or a nested warning-only fail-closed condition), not
for every nested failure; update the code paths that set/clear scan_complete
(and any nested result flags) so that only a dedicated traversal-incomplete
indicator triggers mark_archive_scan_incomplete before calling
result.finish(success=scan_complete and not result.has_errors).
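The alternative this comment proposes, a dedicated traversal-incomplete indicator tracked separately from `scan_complete`, could look roughly like the sketch below. `MemberOutcome`, `summarize_members`, and the `budget_exhausted` parameter are hypothetical names, not part of the scanner; the point is only that a nested failure and an unfinished traversal are recorded in different flags.

```python
from dataclasses import dataclass


@dataclass
class MemberOutcome:
    """Hypothetical per-member result, not a modelaudit type."""

    success: bool
    analysis_incomplete: bool = False


def summarize_members(
    outcomes: list[MemberOutcome], budget_exhausted: bool
) -> tuple[bool, bool]:
    """Return (scan_complete, traversal_incomplete) for one archive."""
    # Stopping early on budget makes the scan both unsuccessful and partial.
    scan_complete = not budget_exhausted
    traversal_incomplete = budget_exhausted
    for outcome in outcomes:
        if not outcome.success:
            # Any nested failure fails the scan, including real findings.
            scan_complete = False
        if outcome.analysis_incomplete:
            # Only an explicit incomplete marker counts as a partial scan.
            traversal_incomplete = True
    return scan_complete, traversal_incomplete
```

A caller would then invoke `mark_archive_scan_incomplete(result, "zip_analysis_incomplete")` only when the second value is true, and still pass `scan_complete and not result.has_errors` to `result.finish`.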
In `@tests/scanners/test_sevenzip_scanner.py`:
- Around line 115-117: The test asserts for the SevenZip scanner currently check
metadata but do not ensure the result's warning/error flags reflect a
WARNING-only fail-closed path; update the assertions around the failing-py7zr
cases (the block that checks result.metadata["scan_outcome"] ==
INCONCLUSIVE_SCAN_OUTCOME and result.metadata["analysis_incomplete"]) to also
assert result.has_warnings is True and result.has_errors is False (do the same
for the second occurrence of this case later in the file), ensuring ScanResult
uses has_warnings for WARNING-severity failures and has_errors remains reserved
for CRITICAL issues.
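The strengthened assertions requested above might be grouped as in this sketch. `StubResult` and `assert_warning_only_inconclusive` are hypothetical helpers, and the value assigned to `INCONCLUSIVE_SCAN_OUTCOME` is a placeholder; the real test would import the constant from the package and use the scanner's actual `ScanResult`.

```python
from dataclasses import dataclass, field
from typing import Any

INCONCLUSIVE_SCAN_OUTCOME = "inconclusive"  # placeholder value, assumed


@dataclass
class StubResult:
    """Stand-in exposing only the attributes the assertions touch."""

    has_warnings: bool
    has_errors: bool
    metadata: dict[str, Any] = field(default_factory=dict)


def assert_warning_only_inconclusive(result: StubResult) -> None:
    # Existing checks: the metadata marks the scan as inconclusive.
    assert result.metadata["scan_outcome"] == INCONCLUSIVE_SCAN_OUTCOME
    assert result.metadata["analysis_incomplete"]
    # Added checks: a WARNING-only fail-closed path sets has_warnings
    # and leaves has_errors reserved for CRITICAL issues.
    assert result.has_warnings is True
    assert result.has_errors is False
```

Sharing one helper between the two occurrences in the file keeps both cases asserting the same contract.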
In `@tests/scanners/test_tar_scanner.py`:
- Around line 737-740: The test callback nested_scan should match the real
contract: change its signature from nested_scan(_path: str, _config: dict[str,
Any] | None) -> ScanResult to accept a non-optional config (e.g.,
nested_scan(_path: str, _config: dict[str, Any]) -> ScanResult) because
_scan_nested_archive_entry always passes nested_config: dict[str, Any]; update
the nested_scan definition accordingly (it still returns ScanResult and calls
nested_result.finish(...)) to remove the unnecessary Optional type.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 77a1a5c0-c665-4185-bed2-15d31f382515
📒 Files selected for processing (8)
- CHANGELOG.md
- modelaudit/scanners/_archive_outcomes.py
- modelaudit/scanners/sevenzip_scanner.py
- modelaudit/scanners/tar_scanner.py
- modelaudit/scanners/zip_scanner.py
- tests/scanners/test_sevenzip_scanner.py
- tests/scanners/test_tar_scanner.py
- tests/scanners/test_zip_scanner.py