Fix ParseBom corrupting stream state on inputs shorter than 3 bytes#219
Fix ParseBom corrupting stream state on inputs shorter than 3 bytes#219mihajlo-k wants to merge 1 commit into
Conversation
|
The created documentation from the pull request is available at: docu-html |
There was a problem hiding this comment.
Pull request overview
Fixes a Vajson JSON parsing edge case where BOM probing on streams shorter than 3 bytes leaves the input stream in a failed state, breaking subsequent parsing.
Changes:
- Add a boundary-value test to ensure parsing succeeds for a 2-byte JSON buffer (
{}) that is shorter than the UTF-8 BOM length. - Add a regression test to ensure a UTF-8 BOM-prefixed buffer is parsed correctly.
- Update
JsonData::ParseBom()to recover the stream state when the BOM check read fails.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
score/json/internal/parser/vajson/vajson_parser_test.cpp |
Adds regression tests for short buffers and UTF-8 BOM-prefixed inputs. |
score/json/internal/parser/vajson/vajson_impl/reader/json_data.cpp |
Clears/rewinds stream on failed BOM read so parsing can continue on short inputs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // ReadString failed (e.g., stream shorter than BOM). | ||
| // Ensure stream is in a usable state for subsequent parsing. | ||
| this->stream_.get().clear(); | ||
| this->stream_.get().seekg(0, std::ios::beg); | ||
| } |
There was a problem hiding this comment.
it is very unlikely that that would happen. but we could add this to be more defensive. however, not sure if the branching with !stream.fail() is really necessary. even in the unlikely case when the stream is still in failure - seekg is just harmless no-op.
ParseBom always attempts to read 3 bytes for the UTF-8 BOM check. When the input stream has fewer than 3 bytes, stream.read() sets failbit|eofbit. The subsequent seekg() in RewindIf is a no-op when failbit is set (per C++ standard), leaving the stream in a failed state with consumed bytes not restored. All subsequent parsing fails. Fix: In ParseBom, when ReadString fails, clear the stream state and seek to position 0 to ensure the stream is usable for parsing.
145fced to
a461c50
Compare
ParseBom always attempts to read 3 bytes for the UTF-8 BOM check. When the input stream has fewer than 3 bytes, stream.read() sets failbit|eofbit. The subsequent seekg() in RewindIf is a no-op when failbit is set (per C++ standard), leaving the stream in a failed state with consumed bytes not restored. All subsequent parsing fails.
Fix: In ParseBom, when ReadString fails, clear the stream state and seek to position 0 to ensure the stream is usable for parsing.