Make remaining bytes validation conditional on strict mode #112

Open
libark wants to merge 3 commits into kixelated:main from libark:feature/strict-validation
Conversation

@libark

libark commented Jan 19, 2026

This PR makes non-strict builds more tolerant of leftover bytes in MP4 atom bodies, improving robustness when decoding real-world files. Strict validation remains unchanged for tests and strict-mode builds.

@coderabbitai

coderabbitai bot commented Jan 19, 2026

Walkthrough

The change conditions remaining-body checks during atom decoding on build configuration. In src/any.rs, src/atom.rs, and src/tokio/atom.rs, checks that previously returned UnderDecode when leftover bytes remained are now executed only when the strict feature is enabled or under cfg(test). In non-strict, non-test builds leftover bytes in atom bodies are no longer treated as UnderDecode. No public signatures were changed.
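
A minimal sketch of the gating described in the walkthrough, written against plain byte slices rather than the crate's real buffer types. The `Error` enum, `STRICT` constant, and `check_remaining` function are illustrative names, not the crate's actual API. Strictness is passed as a parameter so that both paths can be exercised regardless of how the code is compiled.

```rust
/// Simplified stand-in for the crate's error type (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    /// Decoding finished but `remaining` bytes of the atom body were unread.
    UnderDecode { kind: [u8; 4], remaining: usize },
}

/// Strictness as the walkthrough describes it: enabled by the `strict`
/// feature or when compiling tests.
pub const STRICT: bool = cfg!(feature = "strict") || cfg!(test);

/// Checks the unread tail of an atom body after its fields were decoded.
/// `rest` is whatever the decoder left behind.
pub fn check_remaining(kind: [u8; 4], rest: &[u8], strict: bool) -> Result<(), Error> {
    if strict && !rest.is_empty() {
        return Err(Error::UnderDecode { kind, remaining: rest.len() });
    }
    // Non-strict: leftover bytes are tolerated.
    Ok(())
}
```

A caller in this sketch would pass `STRICT` as the last argument, for example `check_remaining(*b"tkhd", rest, STRICT)`.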

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main change: making remaining bytes validation conditional on strict mode rather than always applied. |
| Description check | ✅ Passed | The description clearly explains the purpose and scope of the changes, relating directly to the modifications in the changeset. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/atom.rs`:
- Around line 66-68: The trailing-bytes rejection in the ReadFrom<Option<T>>
implementation is currently unconditional and inconsistent with the earlier
check that only errors when cfg!(feature = "strict") || cfg!(test); update the
Option<T> ReadFrom impl (and the similar occurrence around the other block at
lines referenced in the review) to gate the UnderDecode(T::KIND) return behind
the same condition (cfg!(feature = "strict") || cfg!(test)) so non-strict builds
tolerate trailing bytes like the other path; locate the check inside the impl
ReadFrom for Option<T> and the other matching block and wrap or replace the
unconditional Err(...) with the conditional check used earlier.

@bradh
Collaborator

bradh commented Jan 21, 2026

Apart from the issue flagged by the review bot, it'd be good to test this. A bit tricky with the test == strict case.

And from my review on #111:

> Did this behaviour get mandated in ISOBMFF? I recall some discussion, but not the outcome.

@libark
Author

libark commented Jan 21, 2026

> And from my review on #111:
>
> > Did this behaviour get mandated in ISOBMFF? I recall some discussion, but not the outcome.

ISOBMFF does not mandate tail padding or alignment for boxes. The only strict requirement is that the box size correctly covers all bytes.

MOV writers often pad atoms with zeros to align to 4 or 8 byte boundaries. This is a legacy convention, not required by ISOBMFF, and parsers should handle it.
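
To illustrate the point that the box size covers every byte of the box, here is a small sketch that splits one box off the front of a buffer. It assumes only the plain 32-bit size form (a big-endian `u32` size that includes the 8-byte header, followed by a four-character type). The `split_box` name is hypothetical, and the special sizes 0 (box runs to end of file) and 1 (64-bit largesize) are deliberately left unhandled.

```rust
/// Splits one box off the front of `input`, returning (type, body, rest).
/// Any padding a writer added is part of `body`, because the size field
/// counts it; the next box always starts at `rest`.
pub fn split_box(input: &[u8]) -> Option<([u8; 4], &[u8], &[u8])> {
    if input.len() < 8 {
        return None;
    }
    let size = u32::from_be_bytes([input[0], input[1], input[2], input[3]]) as usize;
    // Sizes 0 and 1 have special meanings and are not handled in this sketch.
    if size < 8 || size > input.len() {
        return None;
    }
    let kind = [input[4], input[5], input[6], input[7]];
    Some((kind, &input[8..size], &input[size..]))
}
```

If a decoder consumes only part of `body`, the unread tail is what the remaining-bytes check in this PR looks at. Skipping to `rest` keeps the parser in sync either way.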

@bradh
Collaborator

bradh commented Jan 22, 2026

The modified behaviour requires tests. Unsure of the best way to do this - probably turning off strict mode whenever we're testing, and checking with and without the feature flag enabled.

> This is a legacy convention, not required by ISOBMFF, and parsers should handle it.

OK. Can you add the quicktime reference for this as an explanatory comment in the code?
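
One possible way to test with and without the feature flag, as suggested above, is sketched here. It assumes strictness is driven only by the `strict` Cargo feature and not by `cfg!(test)`, which differs from the PR as described in the walkthrough. The toy decoder `decode_toy` and the `Error` enum are hypothetical stand-ins, not the crate's types.

```rust
/// Simplified stand-in for the crate's error type (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    /// The body was shorter than the fields being decoded.
    OutOfBounds,
    /// The decoder stopped with `remaining` bytes of the body unread.
    UnderDecode { remaining: usize },
}

/// Assumption: strictness depends only on the Cargo feature, so the test
/// suite can observe both behaviours.
pub const STRICT: bool = cfg!(feature = "strict");

/// Decodes a toy atom body consisting of a single big-endian u16.
pub fn decode_toy(body: &[u8]) -> Result<u16, Error> {
    if body.len() < 2 {
        return Err(Error::OutOfBounds);
    }
    let value = u16::from_be_bytes([body[0], body[1]]);
    let rest = &body[2..];
    if STRICT && !rest.is_empty() {
        return Err(Error::UnderDecode { remaining: rest.len() });
    }
    Ok(value)
}

#[cfg(test)]
mod tests {
    use super::*;

    // Compiled and run under `cargo test --features strict`.
    #[cfg(feature = "strict")]
    #[test]
    fn trailing_bytes_rejected_when_strict() {
        assert_eq!(decode_toy(&[0, 7, 0, 0]), Err(Error::UnderDecode { remaining: 2 }));
    }

    // Compiled and run under plain `cargo test`.
    #[cfg(not(feature = "strict"))]
    #[test]
    fn trailing_bytes_tolerated_when_not_strict() {
        assert_eq!(decode_toy(&[0, 7, 0, 0]), Ok(7));
    }
}
```

Running both `cargo test` and `cargo test --features strict` in CI would then cover each branch. The trade-off is that plain `cargo test` no longer catches under-decoding bugs by default.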

@kixelated
Owner

Should we log when the entire box isn't processed, similar to when unknown atoms are parsed?

The reason I added UnderDecode was primarily to catch implementation bugs and it's saved me a bunch of times. We can explicitly skip boxes that often have padding if that's true @libark, double checking any specs and maybe validating that it's zeroed or something.

@libark
Author

libark commented Jan 23, 2026

> Should we log when the entire box isn't processed, similar to when unknown atoms are parsed?
>
> The reason I added UnderDecode was primarily to catch implementation bugs and it's saved me a bunch of times. We can explicitly skip boxes that often have padding if that's true @libark, double checking any specs and maybe validating that it's zeroed or something.

Checking whether the remaining bytes are zeroed makes sense from a correctness perspective, but it does come with a performance cost.

Using a strict mode here seems like a good compromise: it allows catching potential issues during development, while preserving optimal performance and compatibility for regular crate users.

@kixelated
Owner

> Checking whether the remaining bytes are zeroed makes sense from a correctness perspective, but it does come with a performance cost.

I doubt there would be any measurable performance cost.

Could you make a helper method similar to the one for unknown boxes? Return under-decode when strict, log a warning otherwise?

To avoid spurious logging, we could also check for zeroed padding and only when the logger is enabled?
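
A rough sketch of what such a helper could look like, under stated assumptions: `UnderDecode`, `Trailing`, `classify`, and `check_trailing` are hypothetical names, `eprintln!` stands in for whatever logging macro the crate uses, and the "only when the logger is enabled" condition is not modelled. Strict mode rejects any leftover bytes. Non-strict mode stays silent for all-zero padding and warns for anything else.

```rust
/// Stand-in for the crate's under-decode error (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub struct UnderDecode {
    pub kind: [u8; 4],
    pub remaining: usize,
}

/// What was found after the last decoded field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trailing {
    None,
    ZeroPadding(usize),
    NonZero(usize),
}

/// Classifies the unread tail of an atom body.
pub fn classify(rest: &[u8]) -> Trailing {
    if rest.is_empty() {
        Trailing::None
    } else if rest.iter().all(|&b| b == 0) {
        Trailing::ZeroPadding(rest.len())
    } else {
        Trailing::NonZero(rest.len())
    }
}

/// Strict: any leftover bytes are an error. Non-strict: zero padding is
/// accepted quietly, non-zero leftovers are accepted with a warning.
pub fn check_trailing(kind: [u8; 4], rest: &[u8], strict: bool) -> Result<Trailing, UnderDecode> {
    let found = classify(rest);
    match found {
        Trailing::None => Ok(found),
        _ if strict => Err(UnderDecode { kind, remaining: rest.len() }),
        Trailing::ZeroPadding(_) => Ok(found),
        Trailing::NonZero(n) => {
            // Placeholder for a real logging call.
            eprintln!(
                "warning: ignoring {} trailing bytes in {} atom",
                n,
                String::from_utf8_lossy(&kind)
            );
            Ok(found)
        }
    }
}
```

The zero check is a single pass over the leftover bytes, which are normally only a handful, so it only runs when something was left unread.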

@kixelated kixelated mentioned this pull request Jan 26, 2026
@bradh
Collaborator

bradh commented Jan 26, 2026

Is the "residual" padding always to a 4 byte or 8 byte boundary? Maybe that is a simpler (or additional) check.
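
One way to read the boundary idea is sketched below. It is a heuristic under an assumption that the thread does not confirm: that padding, when present, rounds the atom body up to a multiple of 4 or 8 bytes and is always shorter than the alignment it pads to. The function name `looks_like_alignment_padding` is hypothetical.

```rust
/// Returns true when `remaining` unread bytes could plausibly be padding
/// that rounds a body of `decoded + remaining` bytes up to a 4- or 8-byte
/// multiple. Returns false when nothing is left over.
pub fn looks_like_alignment_padding(decoded: usize, remaining: usize) -> bool {
    if remaining == 0 {
        return false;
    }
    let total = decoded + remaining;
    // Padding to an alignment is always shorter than that alignment.
    [4usize, 8]
        .iter()
        .any(|&align| total % align == 0 && remaining < align)
}
```

A short leftover can pass this test by coincidence, so it may work better combined with the zeroed-bytes check than as a replacement for it.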

@libark libark force-pushed the feature/strict-validation branch from 6639808 to 9f698b9 Compare March 22, 2026 09:22
@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
src/tokio/atom.rs (1)

40-42: Consider adding lightweight diagnostics for ignored trailing bytes in non-strict mode.

When the check is bypassed, trailing bytes are silently dropped. A debug/warn log here would improve field observability without changing decode behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tokio/atom.rs` around lines 40 - 42, when the strict/test check is
bypassed and buf.has_remaining() is true, the trailing bytes are silently
dropped; add a lightweight diagnostic log right where that condition is
evaluated (use the same location around the buf.has_remaining() check in
src/tokio/atom.rs) so non-strict decodes emit a debug or warn message (e.g.
using debug! or warn!) that includes T::KIND and the number of remaining bytes
from buf.remaining() without changing the current return behavior or error
handling (leave the Error::UnderDecode branch intact for strict/test). Ensure
the log is conditional so it only runs in non-strict builds and does not alter
decode semantics.

📥 Commits

Reviewing files that changed from the base of the PR and between 6639808 and 9f698b9.

📒 Files selected for processing (3)
  • src/any.rs
  • src/atom.rs
  • src/tokio/atom.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/atom.rs
  • src/any.rs
