fix: mark malformed GGUF scans inconclusive #914
Walkthrough
The PR hardens GGUF/GGML scanning by treating malformed parse boundaries and unknown tensor types as inconclusive outcomes instead of proceeding with default assumptions. The scanner now fails closed when a scan is inconclusive, unless critical security issues are present.
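The fail-closed rule described in the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `ScanResult`, `exit_code_for`, and the `"inconclusive"` string are hypothetical stand-ins, and the mapping of critical findings to exit code 1 and clean scans to exit code 0 is an assumption. Only exit code 2 for inconclusive scans comes from the review comment on this PR.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in; the real constant's value is not shown in this PR page.
INCONCLUSIVE = "inconclusive"


@dataclass
class ScanResult:
    """Hypothetical, simplified result: severities of issues plus metadata."""
    severities: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


def exit_code_for(result: ScanResult) -> int:
    """Fail closed on inconclusive scans unless a critical issue was found."""
    # Assumption: a critical finding takes priority and is reported as 1.
    if "critical" in result.severities:
        return 1
    # An inconclusive scan is never reported as clean.
    if result.metadata.get("scan_outcome") == INCONCLUSIVE:
        return 2
    # Assumption: 0 means the scan completed and found nothing critical.
    return 0
```

The ordering of the two checks is the design point: a malformed file that also triggers a critical finding still surfaces the finding, while a malformed file with no findings cannot pass as clean.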
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/scanners/test_gguf_scanner.py`:
- Around line 236-270: These tests add new fail-closed paths but don't assert
cached-rerun behavior; add a second scan invocation for each case that hits the
cache and verify the cached results preserve the inconclusive metadata,
scan_outcome_reasons, and exit code 2. Concretely, after the initial direct =
GgufScanner().scan(...) and aggregate = scan_model_directory_or_file(...), call
GgufScanner().scan(...) (or scan_model_directory_or_file(...)) a second time for
the same path, capture the cached_direct/cached_aggregate, and assert
cached_direct.metadata["scan_outcome"] == INCONCLUSIVE_SCAN_OUTCOME, the same
"gguf_parse_incomplete" or "gguf_structure_validation_failed" reason is present
in cached_direct.metadata["scan_outcome_reasons"], and
determine_exit_code(cached_aggregate) == 2 (and mirror the same checks for
issues/success flags as in the original assertions). Ensure you reference the
existing test functions test_gguf_truncated_metadata_returns_exit2 and
test_gguf_unknown_tensor_type_is_inconclusive and reuse GgufScanner().scan,
scan_model_directory_or_file, and determine_exit_code to validate cache
semantics.
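The cached-rerun assertions requested above might look something like the sketch below. Because the project's import paths, cache wiring, and result types are not shown on this page, the sketch uses clearly labeled stand-ins: `FakeGgufScanner`, `fake_determine_exit_code`, the dict-shaped result, and the `"inconclusive"` value are all hypothetical. Only the metadata keys, the `gguf_parse_incomplete` reason, and exit code 2 are taken from the review comment.

```python
import copy

# Stand-in value; the real INCONCLUSIVE_SCAN_OUTCOME is defined in the project.
INCONCLUSIVE_SCAN_OUTCOME = "inconclusive"


class FakeGgufScanner:
    """Hypothetical stand-in for GgufScanner backed by a shared result cache."""
    _cache = {}
    parse_count = 0  # how many times a file was actually parsed

    def scan(self, path):
        if path not in FakeGgufScanner._cache:
            FakeGgufScanner.parse_count += 1
            FakeGgufScanner._cache[path] = {
                "success": False,
                "metadata": {
                    "scan_outcome": INCONCLUSIVE_SCAN_OUTCOME,
                    "scan_outcome_reasons": ["gguf_parse_incomplete"],
                },
            }
        # Return a copy so callers cannot mutate the cached entry.
        return copy.deepcopy(FakeGgufScanner._cache[path])


def fake_determine_exit_code(result):
    """Hypothetical stand-in for determine_exit_code."""
    inconclusive = result["metadata"].get("scan_outcome") == INCONCLUSIVE_SCAN_OUTCOME
    return 2 if inconclusive else 0


def check_cached_rerun(path):
    direct = FakeGgufScanner().scan(path)          # first scan populates the cache
    parses_after_first = FakeGgufScanner.parse_count
    cached_direct = FakeGgufScanner().scan(path)   # second scan should hit the cache
    assert FakeGgufScanner.parse_count == parses_after_first
    assert cached_direct["metadata"]["scan_outcome"] == INCONCLUSIVE_SCAN_OUTCOME
    assert "gguf_parse_incomplete" in cached_direct["metadata"]["scan_outcome_reasons"]
    assert cached_direct["success"] == direct["success"]
    assert fake_determine_exit_code(cached_direct) == 2
    return cached_direct
```

In the real tests, the same pattern would call `GgufScanner().scan(...)` and `scan_model_directory_or_file(...)` a second time on the same path inside `test_gguf_truncated_metadata_returns_exit2` and `test_gguf_unknown_tensor_type_is_inconclusive`, swapping in the `gguf_structure_validation_failed` reason where that is the one the first scan reports.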
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 27b1eb22-26b7-428a-81a5-b2f363a868e9
📒 Files selected for processing (3)
CHANGELOG.md
modelaudit/scanners/gguf_scanner.py
tests/scanners/test_gguf_scanner.py
Summary
Validation
Summary by CodeRabbit