fix: Decode avro with negative count and no tags avro issue #865

Open
speeddragon wants to merge 2 commits into edge from fix/tx_header_parse

speeddragon (Collaborator) commented Apr 17, 2026

Requesting https://arweave.net/raw/sfuxzQEEIFo5w6swrIPNjqUXCkaRm1BiuP5E3tmuNeU causes the header to be parsed incorrectly, leading to issues with non-UTF-8 binaries.

What's happening:

Apache Avro encodes an array as a series of blocks terminated by a zero count, and each block header takes one of two forms:

  1. Short-form: positive Count → Count items follow directly
  2. Long-form: negative Count → abs(Count) items follow, but preceded by a zigzag-encoded byte size of the block (so readers can skip it)

The original decode_avro_tags only handled positive counts. The TXID sfuxzQEEIFo5w6swrIPNjqUXCkaRm1BiuP5E3tmuNeU is a real-world transaction whose tags were encoded using the long-form block (negative count), causing a crash/mismatch on deserialization.

Your current branch already has the fix (lines 594–598):

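  %% Negative count: the block's byte size comes next. Read and discard it,
  %% then decode abs(Count) items as usual.
  decode_avro_tags(Binary, Count) when Count < 0 ->
      {_ByteBlockSize, Rest} = decode_zigzag(Binary),
      decode_avro_tags(Rest, -Count);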

This follows the Avro specification rather than a custom encoding: the clause reads and discards the byte size, then recurses with abs(Count). The test at the bottom of the diff encodes the actual TX binary from that TXID and asserts the two expected tags (IPFS-Hash and Content-Type).
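
For reviewers who want to see the block handling end to end, here is a minimal standalone sketch. It is not the code in this PR: the module name avro_tags_sketch and every function in it are hypothetical, and it assumes each tag is a record of two Avro bytes fields (name, then value). It walks the array block by block, accepts both header forms, and stops at the zero count.

```erlang
-module(avro_tags_sketch).
-export([decode_long/1, decode_tags/1]).

%% Decode one Avro long: a base-128 varint (least significant group
%% first) carrying a zigzag-encoded signed integer.
decode_long(Bin) ->
    {ZigZag, Rest} = decode_varint(Bin, 0, 0),
    {(ZigZag bsr 1) bxor -(ZigZag band 1), Rest}.

%% High bit set means more groups follow; clear means this is the last one.
decode_varint(<<1:1, Group:7, Rest/binary>>, Shift, Acc) ->
    decode_varint(Rest, Shift + 7, Acc bor (Group bsl Shift));
decode_varint(<<0:1, Group:7, Rest/binary>>, Shift, Acc) ->
    {Acc bor (Group bsl Shift), Rest}.

%% Decode an Avro array of {Name, Value} records, block by block.
%% Returns {Tags, RemainingBinary}.
decode_tags(Bin) ->
    decode_blocks(Bin, []).

decode_blocks(Bin, Acc) ->
    case decode_long(Bin) of
        {0, Rest} ->
            %% A zero count terminates the array (also the no-tags case).
            {lists:reverse(Acc), Rest};
        {Count, Rest} when Count < 0 ->
            %% Negative count: a byte size follows; read and ignore it.
            {_ByteSize, Items} = decode_long(Rest),
            decode_items(Items, -Count, Acc);
        {Count, Rest} ->
            decode_items(Rest, Count, Acc)
    end.

%% Decode N records, then look for the next block header.
decode_items(Bin, 0, Acc) ->
    decode_blocks(Bin, Acc);
decode_items(Bin, N, Acc) ->
    {Name, Rest1} = decode_bytes(Bin),
    {Value, Rest2} = decode_bytes(Rest1),
    decode_items(Rest2, N - 1, [{Name, Value} | Acc]).

%% Avro bytes: a long length followed by that many bytes.
decode_bytes(Bin) ->
    {Len, Rest} = decode_long(Bin),
    <<Data:Len/binary, Tail/binary>> = Rest,
    {Data, Tail}.
```

With a made-up single tag, the short-form binary <<2, 2, $a, 2, $b, 0>> and the long-form binary <<1, 8, 2, $a, 2, $b, 0>> should both decode to {[{<<"a">>, <<"b">>}], <<>>}, and <<0>> should decode to {[], <<>>}.
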

The fix is correct and complete. Nothing else needs to change — it's a missing clause in the existing Avro decoder, not a separate custom decoder.


This PR also includes a second fix for transactions with no tags, such as 8cjDy2khfMsc3hrvGp7PrLVYfD_4aYQxEILNSZ0Pv74.

speeddragon force-pushed the fix/tx_header_parse branch from 802b7f1 to d786bbe (April 20, 2026 18:22)
speeddragon changed the base branch from neo/edge to edge (April 20, 2026 18:22)
speeddragon force-pushed the fix/tx_header_parse branch from d786bbe to 16e0b53 (May 14, 2026 18:21)
speeddragon changed the title from "fix: Decode avro with negative count" to "fix: Decode avro with negative count and no tags avro issue" (May 14, 2026)