Skip to content

fix(eap-items): Serialize EAPValue as untagged in attributes_array#7775

Open
phacops wants to merge 4 commits intomasterfrom
fix/eap-items-untagged-array-serialization
Open

fix(eap-items): Serialize EAPValue as untagged in attributes_array#7775
phacops wants to merge 4 commits intomasterfrom
fix/eap-items-untagged-array-serialization

Conversation

@phacops
Copy link
Contributor

@phacops phacops commented Feb 27, 2026

Add #[serde(untagged)] to the EAPValue enum so that array attribute
values in attributes_array are serialized as plain values (e.g. 123,
"hello", 3.14, true) instead of tagged objects (e.g. {"Int": 123},
{"String": "hello"}).

The tagged representation included the enum variant name as a key, which
is not expected by downstream consumers that only need the raw values.

Agent transcript: https://claudescope.sentry.dev/share/Q6CFyxDoBsOrL1Z2w3da6gOTLnKW426exXCS65Bhk-Q

Add #[serde(untagged)] to the EAPValue enum so that array attribute
values are serialized as plain values (e.g. 123, "hello") instead of
tagged objects (e.g. {"Int": 123}, {"String": "hello"}).

Co-Authored-By: Claude <noreply@anthropic.com>

Agent transcript: https://claudescope.sentry.dev/share/0Zw0YbfHZZPQ_C-v69RIcW-5ch8xOLSf-_07vsI2fts
@phacops phacops marked this pull request as ready for review February 27, 2026 17:35
@phacops phacops requested a review from a team as a code owner February 27, 2026 17:35
@@ -177,6 +177,7 @@ macro_rules! seq_attrs {
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Adding #[serde(untagged)] to the EAPValue enum introduces a breaking serialization change, causing downstream Python consumers that expect a tagged object to crash with an AttributeError.
Severity: CRITICAL

Suggested Fix

Either remove the #[serde(untagged)] attribute to revert to the previous tagged serialization format that downstream consumers expect, or update the Python consumer code (specifically the transform_array_value function) to correctly handle the new untagged primitive values.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: rust_snuba/src/processors/eap_items.rs#L177

Potential issue: The addition of `#[serde(untagged)]` to the `EAPValue` enum changes its
serialization format from a tagged object (e.g., `{"Int": 123}`) to a primitive value
(e.g., `123`). A downstream Python consumer, specifically the `transform_array_value`
function, expects the old tagged format and attempts to call `.items()` on each value.
When it receives a primitive type like an integer or string instead of a dictionary, the
call will fail with an `AttributeError`. This will cause runtime crashes in the
`get_trace` and `export_trace_items` RPC endpoints whenever they process items
containing array attributes.

Did we get this right? 👍 / 👎 to inform future reviews.

Update process_arrays to handle the new untagged serialization format
where array values are plain JSON primitives instead of tagged dicts.
Remove the now-unused transform_array_value function.

Co-Authored-By: Claude <noreply@anthropic.com>

Agent transcript: https://claudescope.sentry.dev/share/Ko_bDPVDo8oChJuLF2L67oOcIuDOkjf1V1B2SIwwHU4
@phacops phacops requested a review from a team as a code owner February 27, 2026 17:46
json.loads already returns the correct dict[str, list[Any]] structure,
no need to recreate it.

Co-Authored-By: Claude <noreply@anthropic.com>

Agent transcript: https://claudescope.sentry.dev/share/w74I_wpBjzGAzg8-hafg0nG_ylOEdcbuQbFiod1BtZ0
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

for key, values in parsed.items():
arrays[key] = [transform_array_value(v) for v in values]
return arrays
return json.loads(raw) or {}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backward incompatibility with existing tagged array data in storage

High Severity

The simplified process_arrays can no longer handle old tagged data already persisted in ClickHouse. Existing rows have array values in tagged format (e.g., {"Int": 123}), but the removed transform_array_value was the only code that unwrapped those dicts into plain values. Now, queries hitting old data will pass dict objects to _to_any_value or _value_to_attribute, which both raise BadSnubaRPCRequestException on the unhandled dict type. This will break reads until all old data expires per retention policy.

Additional Locations (1)

Fix in Cursor Fix in Web

Copy link

@noahsmartin noahsmartin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as no-one other than sessions is already using arrays, this looks great, thanks!

Copy link
Contributor

@onewland onewland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fine with this change but you should check that this doesn't break expectations for @thetruecpaul / @shashjar who I believe use arrays in occurrences today

toJSONString converts the JSON column through intermediate ClickHouse
types, which coerces mixed-type array elements to strings. CAST to
String preserves the original JSON representation with proper types.

Co-Authored-By: Claude <noreply@anthropic.com>
@phacops phacops force-pushed the fix/eap-items-untagged-array-serialization branch from 077d4a1 to ff1efae Compare February 27, 2026 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants