Commit 0642160

timsaucer and claude committed
docs(codec): clarify strict-mode security scope
The doc on `with_python_udf_inlining(false)` said strict mode "rejects cloudpickle.loads on untrusted from_bytes input", which could be misread as making `pickle.loads(untrusted)` safe. It does not.

Strict mode only narrows the codec layer: it stops `Expr::from_bytes` from invoking `cloudpickle.loads` on the inline `DFPY*` payload. The outer pickle stream is still arbitrary code: `pickle.loads` honors any `__reduce__` the bytes name, and an attacker is free to choose one. Spell that out in the doc so callers don't treat the toggle as a substitute for "never pickle.loads untrusted input."

The Python-side docstring (`SessionContext.with_python_udf_inlining`) already carries the equivalent caveat; this brings the Rust side in line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 9a265e4 commit 0642160

1 file changed

Lines changed: 12 additions & 0 deletions

crates/core/src/codec.rs

@@ -149,6 +149,18 @@ impl PythonLogicalCodec {
     /// when the codec sits on a session that must produce
     /// cross-language wire bytes, or reject `cloudpickle.loads` on
     /// untrusted `from_bytes` input.
+    ///
+    /// Security scope: strict mode (`false`) protects only the codec
+    /// layer — it stops `Expr::from_bytes` from invoking
+    /// `cloudpickle.loads` on the inline `DFPY*` payload. It does
+    /// **not** make `pickle.loads(untrusted_bytes)` safe. Python's
+    /// pickle protocol permits arbitrary code execution via
+    /// `__reduce__` (and `Expr.__reduce__` returns
+    /// `Expr._reconstruct(bytes)` — an honest reducer here, but the
+    /// outer pickle stream can contain any reducer). Treat every
+    /// `pickle.loads` on untrusted input as unsafe regardless of this
+    /// setting; the toggle only narrows the surface inside
+    /// `from_bytes`.
     pub fn with_python_udf_inlining(mut self, enabled: bool) -> Self {
         self.python_udf_inlining = enabled;
         self
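
The caveat in the added doc comment can be illustrated with a minimal Python sketch. It uses only the standard `pickle` module and no datafusion-python code. The `Hostile` class is hypothetical and exists only for illustration, and `eval` on a harmless arithmetic string stands in for whatever callable an attacker might name. The sketch shows that `pickle.loads` calls the reducer recorded in the byte stream, whatever that reducer is, which is why a codec-level toggle cannot make the outer call safe.

```python
import pickle


class Hostile:
    """Hypothetical attacker-controlled type, for illustration only."""

    def __reduce__(self):
        # The pickle stream records "call eval('6 * 7')". A real attacker
        # could name any importable callable here instead of this harmless one.
        return (eval, ("6 * 7",))


# The bytes name builtins.eval and its argument; they do not reference Hostile.
hostile_bytes = pickle.dumps(Hostile())

# Loading runs the named callable and returns its result.
result = pickle.loads(hostile_bytes)
print(result)
```

The object that comes back is not a `Hostile` instance at all; it is whatever the named callable returned. Nothing in this path touches `Expr::from_bytes`, so the inlining setting has no influence on it.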
