pac_clip_sum, pac_clip_min, pac_clip_max: Per-User Contribution Clipping for PAC Aggregates by ila · Pull Request #13 · cwida/pac

ila · 2026-04-01T13:24:52Z

Adds contribution clipping to PAC aggregates. When pac_clip_support is set, contributions that fall in a magnitude level with fewer distinct contributors than the threshold are hard-zeroed, preventing variance side-channel attacks. Supports integer, float, double, and HUGEINT types for SUM, and integer/float/double for MIN/MAX.

pac_clip_sum (integer)
Level-based magnitude decomposition with SWAR bitslice counters. Each value is routed to a level based on its magnitude (62 levels, 2-bit shift = 4x per level, covering the full 128-bit range). Each level maintains 64 SWAR uint16 counters + overflow uint32 counters + a 64-bit distinct-contributor bitmap. At finalization, levels with fewer distinct contributors than pac_clip_support contribute nothing (hard-zero). Signed values are handled by splitting into separate positive and negative accumulators.
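The routing and hard-zero steps described above can be sketched in a few lines. This is a simplified illustration, not the PR's code: `LevelOfSketch` and `ClipSumSketch` are hypothetical names, the level mapping (`floor(log2(magnitude)) / 2`) is an assumption, a single scalar sum stands in for the 64 SWAR counters, and support is measured by a raw popcount of the bitmap rather than an estimate.

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>

constexpr int kLevelShift = 2;  // 2 bits per level = 4x magnitude band
constexpr int kNumLevels = 62;

// Assumed mapping: level = floor(log2(magnitude)) / 2, clamped to the last level.
inline int LevelOfSketch(uint64_t magnitude) {
    int highest_bit = 0;
    while (magnitude >>= 1) {
        ++highest_bit;
    }
    const int level = highest_bit / kLevelShift;
    return level < kNumLevels ? level : kNumLevels - 1;
}

struct ClipSumSketch {
    struct Level {
        int64_t sum = 0;      // stands in for the per-world SWAR counters
        uint64_t bitmap = 0;  // one bit per hashed contributor
    };
    std::array<Level, kNumLevels> pos{};  // non-negative values
    std::array<Level, kNumLevels> neg{};  // negative values

    void Update(int64_t value, uint64_t user_hash) {
        // Unsigned negation yields the magnitude without overflowing on INT64_MIN.
        const uint64_t magnitude = value < 0 ? uint64_t{0} - static_cast<uint64_t>(value)
                                             : static_cast<uint64_t>(value);
        Level &level = (value < 0 ? neg : pos)[LevelOfSketch(magnitude)];
        level.sum += value;  // overflow handling is omitted in this sketch
        level.bitmap |= uint64_t{1} << (user_hash & 63);
    }

    // Levels with fewer than clip_support set bits contribute nothing (hard-zero).
    int64_t Finalize(std::size_t clip_support) const {
        int64_t total = 0;
        for (int i = 0; i < kNumLevels; ++i) {
            if (std::bitset<64>(pos[i].bitmap).count() >= clip_support) total += pos[i].sum;
            if (std::bitset<64>(neg[i].bitmap).count() >= clip_support) total += neg[i].sum;
        }
        return total;
    }
};
```

With three users near 5,000 and a single user at 1,000,000, the outlier sits alone in its level and is dropped when the support threshold is 2.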

Float/double support
Floating-point values are converted to int64 before entering the integer-based level machinery via ScaleFloatToInt64<FLOAT_TYPE, SHIFT>. The scale factors are powers of 2 (2^20 for float, 2^27 for double), so the multiplication only changes the exponent and is exact in IEEE 754 as long as the result stays in range; the scaling itself introduces no rounding error. Branchless clamping to [INT64_MIN, INT64_MAX] handles overflow. At finalization, the accumulated integer result is divided by the scale factor to recover the original floating-point range. The conversion to int64 keeps a fixed resolution of 2^-20 (about 6 decimal places) for float and 2^-27 (about 8 decimal places) for double, which is sufficient for the PAC noise regime where the noise magnitude exceeds the lost precision.
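A minimal sketch of the scale-and-clamp step, for illustration only. `ScaleToInt64Sketch` and `UnscaleSketch` are hypothetical stand-ins for the PR's `ScaleFloatToInt64`; this version uses ordinary branches rather than branchless clamping, scales in double precision, and maps NaN to 0, all of which are assumptions.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Scale by 2^SHIFT and convert to int64, clamping out-of-range inputs.
template <typename FLOAT_TYPE, int SHIFT>
int64_t ScaleToInt64Sketch(FLOAT_TYPE value) {
    // ldexp multiplies by a power of two by adjusting the exponent.
    const double scaled = std::ldexp(static_cast<double>(value), SHIFT);
    constexpr double kTwo63 = 9223372036854775808.0;  // 2^63
    if (std::isnan(scaled)) {
        return 0;  // assumption: NaN contributes nothing
    }
    // Clamp before the cast: converting an out-of-range double is undefined.
    if (scaled >= kTwo63) {
        return std::numeric_limits<int64_t>::max();
    }
    if (scaled <= -kTwo63) {
        return std::numeric_limits<int64_t>::min();
    }
    return static_cast<int64_t>(scaled);  // truncates toward zero
}

// Undo the scaling on the accumulated integer result.
template <typename FLOAT_TYPE, int SHIFT>
FLOAT_TYPE UnscaleSketch(int64_t accumulated) {
    return static_cast<FLOAT_TYPE>(std::ldexp(static_cast<double>(accumulated), -SHIFT));
}
```

The truncating cast is where precision is lost: anything finer than 2^-SHIFT in the input does not survive the round trip.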

pac_clip_min / pac_clip_max
Level-based clipping for MIN/MAX using int8_t extremes per level instead of uint16 counters. Each value is routed to a level by magnitude (same 62-level, 2-bit-shift structure as clip_sum), then an arithmetic right shift compresses it to int8_t [-128, 127]. The sign is preserved because arithmetic shift extends the sign bit.
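The compression step can be illustrated as follows. The routing rule here (pick the smallest level whose shift brings the value into int8_t range) is an assumption, and `CompressToInt8Sketch` is a hypothetical name; the PR's actual level selection may differ.

```cpp
#include <cstdint>

constexpr int kBitsPerLevel = 2;  // 4x magnitude band per level
constexpr int kMaxLevel = 61;     // 62 levels, indices 0..61

struct CompressedSketch {
    int level;     // magnitude level the value was routed to
    int8_t value;  // value after shifting right by level * 2 bits
};

inline CompressedSketch CompressToInt8Sketch(int64_t v) {
    int level = 0;
    int64_t shifted = v;
    while ((shifted < INT8_MIN || shifted > INT8_MAX) && level < kMaxLevel) {
        // Right shift of a negative value is arithmetic on mainstream compilers
        // and guaranteed to be so since C++20, so the sign is preserved.
        shifted >>= kBitsPerLevel;
        ++level;
    }
    return {level, static_cast<int8_t>(shifted)};
}
```

Because the shift rounds toward negative infinity, -1000 compresses to -63 at level 2 while 1000 compresses to 62 at the same level.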

Each level stores:

8 × uint64_t SWAR-packed int8_t extremes (64 worlds × 1 byte each)
1 × uint64_t bitmap — distinct-contributor tracking, same birthday-paradox estimation as clip_sum
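The description does not spell out the estimator behind the bitmap. One standard choice, shown here purely as an assumption, is the linear-counting inversion: if n contributors each set one uniformly random bit out of 64, the expected fraction of zero bits is (1 - 1/64)^n, which can be solved for n. `EstimateDistinctSketch` is a hypothetical name.

```cpp
#include <bitset>
#include <cmath>
#include <cstdint>
#include <limits>

// Estimate the number of distinct contributors from a 64-bit occupancy bitmap.
inline double EstimateDistinctSketch(uint64_t bitmap) {
    constexpr double kBits = 64.0;
    const double set_bits = static_cast<double>(std::bitset<64>(bitmap).count());
    if (set_bits >= kBits) {
        // A saturated bitmap gives no upper bound on the contributor count.
        return std::numeric_limits<double>::infinity();
    }
    // Solve (1 - 1/64)^n = 1 - set_bits/64 for n.
    return std::log(1.0 - set_bits / kBits) / std::log(1.0 - 1.0 / kBits);
}
```

The estimate exceeds the raw popcount as the bitmap fills up, because collisions become more likely; a half-full bitmap corresponds to roughly 44 contributors under this model.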

At finalization, per-level extremes are reconstructed by left-shifting by level * 2 bits. Levels below the pac_clip_support threshold are excluded (hard-zero). The final result is the most extreme (smallest for MIN, largest for MAX) surviving value across all non-zeroed levels.
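A sketch of the MAX finalization for a single world, under stated assumptions: `LevelSummarySketch` and `FinalizeMaxSketch` are hypothetical names, support is a raw popcount of the bitmap, the result is 0 when no level survives, and only levels reachable from int64 inputs are handled.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

struct LevelSummarySketch {
    int level;        // magnitude level index
    int8_t extreme;   // compressed per-level maximum
    uint64_t bitmap;  // distinct-contributor bitmap
};

// Undo the compression: equivalent to a left shift by level * 2 bits.
// Multiplication avoids left-shifting a negative value (undefined before C++20).
inline int64_t ReconstructSketch(int8_t compressed, int level) {
    return static_cast<int64_t>(compressed) * (int64_t{1} << (2 * level));
}

inline int64_t FinalizeMaxSketch(const std::vector<LevelSummarySketch> &levels,
                                 std::size_t clip_support) {
    bool found = false;
    int64_t best = 0;  // assumption: result is 0 if every level is zeroed
    for (const LevelSummarySketch &entry : levels) {
        if (std::bitset<64>(entry.bitmap).count() < clip_support) {
            continue;  // hard-zero: unsupported level is skipped entirely
        }
        const int64_t value = ReconstructSketch(entry.extreme, entry.level);
        if (!found || value > best) {
            best = value;
            found = true;
        }
    }
    return best;
}
```

The reconstructed value is a multiple of 4^level, so an input of 1000 stored at level 2 comes back as 992.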

BOUNDOPT optimization: Each level tracks the worst-of-64 extreme as a scalar level_bounds[k]. During update, if the incoming shifted value cannot beat the current bound, the expensive SWAR update is skipped entirely. The bound is recomputed every 64 updates. This optimization is critical for skewed distributions where most values land in the same few levels.
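The bound check can be sketched for the MAX case as below. This is an illustration with hypothetical names (`MaxLevelBoundSketch`, `world_mask`); it uses a plain per-lane loop in place of the SWAR update and assumes that only non-skipped updates count toward the 64-update recompute interval.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

struct MaxLevelBoundSketch {
    std::array<int8_t, 64> extremes;  // per-world maxima for one level
    int8_t bound = INT8_MIN;          // lower bound on the smallest per-world maximum
    int updates_since_recompute = 0;
    uint64_t skipped = 0;             // bookkeeping for this sketch only

    MaxLevelBoundSketch() { extremes.fill(INT8_MIN); }

    // world_mask has bit w set if the row belongs to world w.
    void Update(int8_t value, uint64_t world_mask) {
        // Every world already holds at least `bound`, so a value at or below it
        // cannot raise any maximum and the per-world update is skipped.
        if (value <= bound) {
            ++skipped;
            return;
        }
        for (int w = 0; w < 64; ++w) {
            if ((world_mask >> w) & 1) {
                extremes[w] = std::max(extremes[w], value);
            }
        }
        // The bound is allowed to go stale between recomputations; since maxima
        // only grow, a stale bound is merely conservative.
        if (++updates_since_recompute == 64) {
            bound = *std::min_element(extremes.begin(), extremes.end());
            updates_since_recompute = 0;
        }
    }
};
```

For MIN the comparison and the bound are mirrored: the bound tracks the largest per-world minimum, and values at or above it are skipped.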

Inline level optimization: One level can be stored inline in the state struct (overlapping the last 9 pointer slots = 72 bytes), avoiding an arena allocation for the common case where only one level is active.
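The size arithmetic behind the inline level can be checked with a layout sketch. The struct names, the union layout, and the `CanUseInlineSketch` rule are assumptions made for illustration, and the sketch assumes 64-bit pointers; the PR's state struct may be organized differently.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kNumLevelSlots = 62;  // one pointer slot per level
constexpr std::size_t kInlineSlots = 9;     // slots overlapped by the inline level

struct MinMaxLevelSketch {
    uint64_t extremes[8];  // 64 worlds x 1 byte each, SWAR-packed
    uint64_t bitmap;       // distinct-contributor bitmap
};

static_assert(sizeof(void *) == 8, "sketch assumes 64-bit pointers");
static_assert(sizeof(MinMaxLevelSketch) == kInlineSlots * sizeof(void *),
              "one level occupies exactly 9 pointer slots (72 bytes)");

struct MinMaxStateSketch {
    union {
        // Regular layout: one arena pointer per level.
        MinMaxLevelSketch *levels[kNumLevelSlots];
        // Inline layout: the last 9 slots hold one level's data directly.
        struct {
            MinMaxLevelSketch *head[kNumLevelSlots - kInlineSlots];
            MinMaxLevelSketch inline_level;
        } packed;
    };
};

static_assert(sizeof(MinMaxStateSketch) == kNumLevelSlots * sizeof(void *),
              "inline storage must not grow the state");

// Assumed rule: inline storage is usable only while the overlapped slots are unused.
inline bool CanUseInlineSketch(std::size_t level) {
    return level < kNumLevelSlots - kInlineSlots;
}
```

The point of the overlap is that the state stays at the size of the pointer array (496 bytes here) whether or not the inline level is in use.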

Tests

test/sql/pac_clip_sum.test (485 lines): level boundaries, HUGEINT, over-clipping, multi-group, float/double scaling, mixed types
test/sql/pac_clip_min_max.test (282 lines): basic min/max, signed values, float/double, hard-zero at low support, NULL handling

Change suffix attenuation from soft-clamp (scale by 16^distance) to hard-zero (skip entirely). Unsupported magnitude levels now contribute nothing to the result, fully eliminating the variance side-channel.

Attack results with clip_support=2:
- Small filter (3-4 users): 96% → 47% (random)
- 20K small items: 96% → 53% (random)
- Std ratio in/out: 90x → 0.87x

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Finer-grained magnitude levels (2-bit bands, 4x per level) allow the clipping mechanism to catch moderate outliers that were previously invisible within the same 16x-wide level. A 10x outlier (50k vs 5k normal) now lands in a different level and gets hard-zeroed.

Changes:
- PAC2_LEVEL_SHIFT: 4 → 2
- PAC2_NUM_LEVELS: 31 → 32 (covers int64; HUGEINT clamps to level 31)
- GetLevel/GetLevel128: divide by 2 instead of 4, clamp to max level
- Inline optimization threshold: 13 → 14
- All shift extraction: level << 2 → level << 1

Memory: +8 bytes per state (256 vs 248 byte pointer array). Negligible.
Performance: no regression on TPCH Q01 SF1 (1.38s → 1.31s).
Security: moderate outlier attack drops from 76.5% to 52.9% (random).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
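The 50k vs 5k example can be worked through with a small helper. The mapping used here, `floor(log2(v)) / bits_per_level`, is an assumption chosen to illustrate the band widths; `FloorLog2Sketch` and `LevelWithShiftSketch` are hypothetical names, not the PR's GetLevel.

```cpp
#include <cstdint>

// Position of the highest set bit; returns 0 for inputs 0 and 1.
inline int FloorLog2Sketch(uint64_t v) {
    int bits = 0;
    while (v >>= 1) {
        ++bits;
    }
    return bits;
}

// Assumed level mapping: each level spans bits_per_level bits of magnitude.
inline int LevelWithShiftSketch(uint64_t v, int bits_per_level) {
    return FloorLog2Sketch(v) / bits_per_level;
}
```

Under this mapping, 5,000 and 50,000 have highest bits 12 and 15. With 4-bit bands both fall in level 3, so the outlier shares a level with the normal values; with 2-bit bands they fall in levels 6 and 7, so the outlier is isolated and can be zeroed.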

…vior

With hard-zero, unsupported outlier levels contribute nothing, so the clipped result equals (not exceeds) the no-outlier baseline. Change > to >=.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Increase PAC2_NUM_LEVELS from 32 to 62 to cover the full 128-bit range without clamping. int64 values naturally use only levels 0-29 (the extra pointer slots remain NULL, no per-level data is allocated). The inline optimization threshold moves from 14 to 44 accordingly.

Memory: +240 bytes per state for the pointer array (496 vs 256 bytes). Per-level data allocations are unchanged for int64 workloads.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…i-group

New test cases:
- Level boundary routing (same-level vs cross-level with 4x bands)
- HUGEINT outlier clipping (values at 2^70, beyond int64 range)
- Negative HUGEINT outlier via neg_state
- Over-clipping (clip_support > group size → zero result)
- Multi-group with outlier isolated to one group

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Fetched from main and added:
- Development rules: test coverage, no test removal, codebase-first search, helper function reuse, duckdb submodule is read-only
- Reference to the PAC paper (arXiv:2603.15023)
- PAC_DEBUG_PRINT usage guidance

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Attack scripts testing the variance side-channel MIA against pac_clip_sum:
- clip_attack_test.sh: main suite (small filter, wide filter, 10K users, etc.)
- clip_multirow_test.sh: 20K small items user (tests pre-aggregation)
- clip_hardzero_stress.sh: stress tests (high trials, composed queries, collusion)
- clip_shift2_stress.sh: tests with 4x magnitude levels (shift=2)
- clipping_experiment.sh: input clipping (Winsorization) baseline
- output_clipping_experiment.sh: post-hoc output clipping baseline
- output_clipping_v2_experiment.sh: output clipping before noise
- clip_attack_results.md: full evaluation with findings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- CLAUDE.md: added code style rules (clang-tidy naming, clang-format style), attack evaluation section, development rules
- .claude/settings.json: PostToolUse hook to auto-run make format-fix after edits
- Skills: /run-attacks, /test-clip, /explain-pac, /explain-dp, /explain-pac-ddl

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>