
fix(avro): error on complex (non-nullable) unions instead of silently dropping (#777) #808

Merged
zeroshade merged 1 commit into apache:main from SAY-5:say5-avro-complex-union-777
May 13, 2026


@SAY-5
Contributor

@SAY-5 SAY-5 commented May 10, 2026

Rationale for this change

Fixes #777. ArrowSchemaFromAvro previously silently dropped non-nullable Avro unions (e.g. ["int","string"]) in both arrowSchemafromAvro and iterateFields. Output schemas were missing fields with no warning to the caller.

What changes are included in this PR?

Panic with a clear error message at the two case "union" / case *avro.UnionSchema arms when the union is not a simple nullable union. The package's existing recover() in ArrowSchemaFromAvro converts this panic into a returned error.
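
A minimal sketch of the panic-then-recover pattern described above, using only the standard library. The names `convertSchema`, `walkUnion`, and `isSimpleNullable` are hypothetical stand-ins rather than the package's real functions, and treating "simple nullable" as exactly two branches with one being "null" is an assumption.

```go
package main

import "fmt"

// isSimpleNullable reports whether a union has exactly two branches, one of
// which is "null". Assumption: this is what "simple nullable" means here.
func isSimpleNullable(branches []string) bool {
	if len(branches) != 2 {
		return false
	}
	return branches[0] == "null" || branches[1] == "null"
}

// walkUnion is a hypothetical internal walker. On an unsupported union it
// panics with an error value instead of silently skipping the field.
func walkUnion(path string, branches []string) string {
	if !isSimpleNullable(branches) {
		panic(fmt.Errorf("complex (non-nullable) avro union at '%v' is not supported", path))
	}
	// Return the non-null branch as the field's type.
	if branches[0] == "null" {
		return branches[1]
	}
	return branches[0]
}

// convertSchema is a hypothetical public entry point. Its deferred recover
// turns a panic raised by the walker into an ordinary returned error.
func convertSchema(path string, branches []string) (typ string, err error) {
	defer func() {
		if r := recover(); r != nil {
			if e, ok := r.(error); ok {
				err = e
				return
			}
			err = fmt.Errorf("invalid avro schema: %v", r)
		}
	}()
	return walkUnion(path, branches), nil
}

func main() {
	fmt.Println(convertSchema("rec.a", []string{"null", "int"}))
	fmt.Println(convertSchema("rec.b", []string{"int", "string"}))
}
```

Panicking inside the walker keeps the recursive code free of error plumbing, while the single recover at the entry point means callers still see a normal `error` return.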

Are these changes tested?

Yes, TestComplexUnionReportsError exercises a record with a ["int","string"] field and asserts a non-nil error that mentions "union".

Are there any user-facing changes?

Callers that previously received an incomplete schema for a complex-union field now get an explicit error.

… dropping (apache#777)

Signed-off-by: SAY-5 <saiasish.cnp@gmail.com>
@SAY-5 SAY-5 requested a review from zeroshade as a code owner May 10, 2026 23:26
Member

@zeroshade zeroshade left a comment


LGTM, thanks for this! Just had one question

Comment thread arrow/avro/schema.go
arrowSchemafromAvro(n)
}
} else {
panic(fmt.Errorf("complex (non-nullable) avro union at '%v' is not supported", n.schemaPath()))
Member


Should we add support and use the arrow union type? We could also do this as a follow up instead.

@SAY-5
Contributor Author

SAY-5 commented May 12, 2026

I'd lean toward doing the Arrow union type support as a follow-up. This PR is intentionally narrow: it just turns the silent data-loss case (non-nullable unions being dropped) into a clear error, which is a safe incremental step. Mapping Avro unions onto arrow.UnionType (dense vs sparse, type-id assignment, the reader/writer plumbing) is a meatier change that deserves its own PR and tests. Happy to open a tracking issue for it if you'd like.
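
To give a rough sense of what that follow-up involves, here is a sketch of the dense union layout for ["int","string"], modelled with plain slices instead of the arrow package's builders. The `denseUnion` type, its methods, and the positional type-id assignment are all hypothetical illustrations, not a proposed implementation.

```go
package main

import "fmt"

// Type ids assigned by branch position in the Avro union ["int","string"].
// Assumption: positional assignment; a real implementation may differ.
const (
	idInt    int8 = 0
	idString int8 = 1
)

// denseUnion models the buffers of a dense union with two children.
type denseUnion struct {
	typeIDs []int8   // one entry per row: which child holds the value
	offsets []int32  // one entry per row: index into that child
	ints    []int32  // child 0, for Avro "int"
	strs    []string // child 1, for Avro "string"
}

// appendInt records an int row: the offset is the child's current length.
func (u *denseUnion) appendInt(v int32) {
	u.typeIDs = append(u.typeIDs, idInt)
	u.offsets = append(u.offsets, int32(len(u.ints)))
	u.ints = append(u.ints, v)
}

// appendString records a string row in the same way.
func (u *denseUnion) appendString(v string) {
	u.typeIDs = append(u.typeIDs, idString)
	u.offsets = append(u.offsets, int32(len(u.strs)))
	u.strs = append(u.strs, v)
}

// value looks up row i by following its type id and offset.
func (u *denseUnion) value(i int) any {
	switch u.typeIDs[i] {
	case idInt:
		return u.ints[u.offsets[i]]
	default:
		return u.strs[u.offsets[i]]
	}
}

func main() {
	u := &denseUnion{}
	u.appendInt(1)
	u.appendString("a")
	u.appendInt(2)
	for i := range u.typeIDs {
		fmt.Println(u.typeIDs[i], u.offsets[i], u.value(i))
	}
}
```

A sparse union would drop the offsets buffer and instead keep every child at the full row count, padding the slots a row does not use, which is one of the trade-offs the follow-up would need to settle.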

@zeroshade
Member

Sounds good, please add a tracking issue for the union support and we can do that as a follow-up.

I'll merge this after CI passes.

@zeroshade zeroshade merged commit bf16827 into apache:main May 13, 2026
40 of 41 checks passed


Development

Successfully merging this pull request may close these issues.

[Avro] Silently ignores complex unions

2 participants