
Commit 5ae44f2

timsaucer and claude committed
docs(codec): justify synthetic input field names and nullability
Encoders for scalar, aggregate, and window UDFs build IPC input schemas from `Field::new(format!("arg_{i}"), dt, true)` — synthetic names, unconditional nullability.

Add a comment at each site explaining that the field wrapper is only a transport for the `DataType`: the receiver immediately collapses these fields back to `Vec<DataType>` when reconstructing `Signature::Exact`, which cannot encode names or nullability. Setting realistic values here would be discarded on decode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent d578c85 commit 5ae44f2

1 file changed

Lines changed: 21 additions & 0 deletions

crates/core/src/codec.rs

@@ -481,6 +481,13 @@ fn encode_python_scalar_udf(py: Python<'_>, udf: &PythonFunctionScalarUDF) -> Py
             )));
         }
     };
+    // Input fields exist only as a transport for the per-arg
+    // `DataType`. Names default to `arg_{i}` and nullability to
+    // `true` because the underlying `TypeSignature::Exact` cannot
+    // express either — the receiver immediately collapses these
+    // fields back to `Vec<DataType>` when reconstructing the
+    // `Signature`, so any nullability or metadata set here would be
+    // discarded.
     let input_fields: Vec<Field> = input_dtypes
         .into_iter()
         .enumerate()
@@ -667,6 +674,13 @@ fn encode_python_window_udf(py: Python<'_>, udf: &PythonFunctionWindowUDF) -> Py
             )));
         }
     };
+    // Input fields exist only as a transport for the per-arg
+    // `DataType`. Names default to `arg_{i}` and nullability to
+    // `true` because the underlying `TypeSignature::Exact` cannot
+    // express either — the receiver immediately collapses these
+    // fields back to `Vec<DataType>` when reconstructing the
+    // `Signature`, so any nullability or metadata set here would be
+    // discarded.
     let input_fields: Vec<Field> = input_dtypes
         .into_iter()
         .enumerate()
@@ -795,6 +809,13 @@ fn encode_python_agg_udf(py: Python<'_>, udf: &PythonFunctionAggregateUDF) -> Py
             )));
         }
     };
+    // Input fields exist only as a transport for the per-arg
+    // `DataType`. Names default to `arg_{i}` and nullability to
+    // `true` because the underlying `TypeSignature::Exact` cannot
+    // express either — the receiver immediately collapses these
+    // fields back to `Vec<DataType>` when reconstructing the
+    // `Signature`, so any nullability or metadata set here would be
+    // discarded.
     let input_fields: Vec<Field> = input_dtypes
         .into_iter()
         .enumerate()
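
The round-trip described in the added comment can be illustrated with a small standalone sketch. It deliberately does not use the real Arrow or DataFusion types: `DataType` and `Field` below are simplified stand-ins, and `encode_input_fields` and `decode_exact_types` are hypothetical helper names that do not appear in `codec.rs`. The sketch only aims to show why a realistic name or nullability flag would not survive decoding when the receiver keeps nothing but the per-argument types.

```rust
// Stand-in types for illustration only. The real code uses Arrow's
// `DataType` and `Field`; these simplified versions are assumptions.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Utf8,
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

impl Field {
    fn new(name: impl Into<String>, data_type: DataType, nullable: bool) -> Self {
        Field {
            name: name.into(),
            data_type,
            nullable,
        }
    }
}

/// Encoder side (hypothetical helper): wrap each argument type in a field
/// with a synthetic `arg_{i}` name and unconditional nullability.
fn encode_input_fields(input_dtypes: Vec<DataType>) -> Vec<Field> {
    input_dtypes
        .into_iter()
        .enumerate()
        .map(|(i, dt)| Field::new(format!("arg_{i}"), dt, true))
        .collect()
}

/// Decoder side (hypothetical helper): keep only the data types, which is
/// all an exact type signature can represent. Names and nullability are
/// dropped here.
fn decode_exact_types(fields: &[Field]) -> Vec<DataType> {
    fields.iter().map(|f| f.data_type.clone()).collect()
}
```

Under these assumptions, decoding `encode_input_fields(vec![DataType::Int64, DataType::Utf8])` and decoding a hand-built list such as `Field::new("id", DataType::Int64, false)` followed by `Field::new("label", DataType::Utf8, true)` produce the same `Vec<DataType>`, which is the point the commit's comment makes.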
