Skip to content

fix: drop polonius-the-crab, inline unsafe reborrow#404

Open
catinspace-au wants to merge 3 commits intoClickHouse:mainfrom
hyperi-io:hyperi/remediation-polonius
Open

fix: drop polonius-the-crab, inline unsafe reborrow#404
catinspace-au wants to merge 3 commits intoClickHouse:mainfrom
hyperi-io:hyperi/remediation-polonius

Conversation

@catinspace-au
Copy link
Copy Markdown
Contributor

polonius-the-crab has so many abandonment issues it needs therapy --
two RustSec advisories across its transitive deps, the whole tree
stagnant for 12+ months. Four crates for a macro that expands to
one line of unsafe.

Inlined the raw-pointer reborrow with a full safety proof and
documentation of every alternative we tried (and why none compiled
without unsafe). Removes 4 crates, clears 2 advisories.

Derek added 3 commits March 19, 2026 15:17
- fxhash (RUSTSEC-2025-0057, unmaintained) -> rustc-hash 2.x
- linked-hash-map (no releases since 2020) -> indexmap (already a dep)
- cargo update for semver-compatible bumps
…436)

- Drop polonius-the-crab and 3 transitive deps (paste, higher-kinded-types,
  macro_rules_attribute) — clears RUSTSEC-2024-0436 (paste unmaintained)
- Inline the raw-pointer reborrow that polonius wrapped behind a macro
- Fix rustfmt.toml edition 2021 -> 2024 to match Cargo.toml
4 mock-based tests (no ClickHouse needed) + 3 integration tests
covering the unsafe reborrow paths in poll_next and Next::poll:
- single row, multi-row, empty result, fetch_all/fetch_one (mock)
- large result spanning chunks, borrowed rows, small block size (integration)
Comment thread src/cursors/row.rs
Comment on lines +103 to +131
// Why the unsafe reborrow?
//
// We hate unsafe. Genuinely. But NLL (the current borrow checker) can't
// see that `bytes` is dead in the NotEnoughData branch of this loop.
// The returned value borrows from `bytes`, so NLL extends that borrow
// to the function's return lifetime — blocking the `bytes.extend()`
// that only runs when no value exists. Classic Polonius limitation:
// https://github.com/rust-lang/rust/issues/51132
//
// This used to be the `polonius-the-crab` crate, which wraps the exact
// same raw-pointer reborrow behind a macro. We dropped it because
// polonius-the-crab has so many abandonment issues it needs therapy:
// - `paste` transitive dep: RUSTSEC-2024-0436 (unmaintained)
// - `polonius-the-crab` itself: no meaningful commits in 12+ months
// - `higher-kinded-types`, `macro_rules_attribute`: same story
// Four stagnant crates, two RustSec advisories, all for a macro that
// expands to one line of unsafe. Two lines of unsafe instead of four
// crates is a good return.
//
// We properly tried to avoid this:
// - TryRow enum (borrow still escapes via return type — same error)
// - async-only next() + poll_next_owned for Stream (same NLL issue)
// - interior mutability in BytesExt via UnsafeCell (3x the diff,
// same amount of actual unsafe, just hidden — not actually better)
// - double deserialisation / probe-then-extract (~2x deser cost on
// the happy path — non-starter for a perf-sensitive cursor)
// None compiled without unsafe somewhere, or had unacceptable costs.
//
// When Polonius lands in stable rustc, rip this out. We'll buy it a beer.
Copy link
Copy Markdown
Contributor

@abonander abonander Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this comment will read like an official statement from ClickHouse, I feel the tone needs to be adjusted somewhat:

Suggested change
// Why the unsafe reborrow?
//
// We hate unsafe. Genuinely. But NLL (the current borrow checker) can't
// see that `bytes` is dead in the NotEnoughData branch of this loop.
// The returned value borrows from `bytes`, so NLL extends that borrow
// to the function's return lifetime — blocking the `bytes.extend()`
// that only runs when no value exists. Classic Polonius limitation:
// https://github.com/rust-lang/rust/issues/51132
//
// This used to be the `polonius-the-crab` crate, which wraps the exact
// same raw-pointer reborrow behind a macro. We dropped it because
// polonius-the-crab has so many abandonment issues it needs therapy:
// - `paste` transitive dep: RUSTSEC-2024-0436 (unmaintained)
// - `polonius-the-crab` itself: no meaningful commits in 12+ months
// - `higher-kinded-types`, `macro_rules_attribute`: same story
// Four stagnant crates, two RustSec advisories, all for a macro that
// expands to one line of unsafe. Two lines of unsafe instead of four
// crates is a good return.
//
// We properly tried to avoid this:
// - TryRow enum (borrow still escapes via return type — same error)
// - async-only next() + poll_next_owned for Stream (same NLL issue)
// - interior mutability in BytesExt via UnsafeCell (3x the diff,
// same amount of actual unsafe, just hidden — not actually better)
// - double deserialisation / probe-then-extract (~2x deser cost on
// the happy path — non-starter for a perf-sensitive cursor)
// None compiled without unsafe somewhere, or had unacceptable costs.
//
// When Polonius lands in stable rustc, rip this out. We'll buy it a beer.
// We generally avoid unsafe. But NLL (the current borrow checker) can't
// see that `bytes` is dead in the NotEnoughData branch of this loop.
// The returned value borrows from `bytes`, so NLL extends that borrow
// to the function's return lifetime — blocking the `bytes.extend()`
// that only runs when no value exists. The fix is blocked on Polonius:
// https://github.com/rust-lang/rust/issues/51132
//
// This used to be the `polonius-the-crab` crate, which wraps the exact
// same raw-pointer reborrow behind a macro. We dropped it because
// polonius-the-crab and its dependencies appear to be unmaintained:
//
// - `paste` transitive dep: RUSTSEC-2024-0436 (unmaintained)
// - `polonius-the-crab` itself: no meaningful commits in 12+ months
// - `higher-kinded-types`, `macro_rules_attribute`: same story
///
// Four unmaintained crates, two RustSec advisories, all for a macro
// that expands to one line of unsafe. Two lines of unsafe instead of
// four crates is a good return.
//
// We tried to avoid this:
// - TryRow enum (borrow still escapes via return type — same error)
// - async-only next() + poll_next_owned for Stream (same NLL issue)
// - interior mutability in BytesExt via UnsafeCell (3x the diff,
// same amount of actual unsafe, just hidden — not actually better)
// - double deserialisation / probe-then-extract (~2x deser cost on
// the happy path — non-starter for a perf-sensitive cursor)
// None compiled without unsafe somewhere, or had unacceptable costs.
//
// When Polonius lands in stable rustc, it should be possible to replace this
// with more intuitive, safe code.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem here. A lot of the code is ported from dfe (internal) so assume I'm 100% fine either way rewriting my less formal code comments

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants