@xanderbailey xanderbailey commented Jan 14, 2026

Add Core Encryption Primitives for Iceberg Encryption Support.

Part of #2034

Summary

This PR introduces the foundational cryptographic primitives needed for implementing encryption in iceberg-rust, providing AES-GCM encryption operations that match the Java implementation's behavior and data format.

Motivation

Iceberg's Java implementation supports table-level encryption to protect sensitive data at rest. To achieve feature parity and ensure interoperability between Java and Rust implementations, we need to build encryption support from the ground up. This PR provides the core cryptographic operations that will serve as the foundation for the complete encryption feature.

Changes

New Module: encryption

Added a new encryption module with core AES-GCM cryptographic operations:

  • encryption/crypto.rs - Core encryption implementation
    • EncryptionAlgorithm enum supporting only AES-128-GCM, since that is the only algorithm the arrow-rs Parquet implementation currently supports
    • SecureKey struct with automatic memory zeroization for security
    • AesGcmEncryptor providing encrypt/decrypt operations with AAD support
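To illustrate the idea behind SecureKey, here is a minimal std-only sketch of a key wrapper that wipes its bytes on drop. It is not the code in this PR: the PR uses the zeroize crate, while this sketch substitutes a hand-written volatile wipe so that it has no dependencies. The field name, the `new`/`as_bytes` methods, the `wipe` helper, and the 16-byte length check (AES-128 only) are all assumptions made for illustration.

```rust
use std::fmt;
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical stand-in for the PR's `SecureKey`; names and layout are assumptions.
pub struct SecureKey {
    bytes: Vec<u8>,
}

/// Overwrite a buffer with zeros using volatile writes so the compiler
/// does not optimise the stores away. The real PR relies on the `zeroize`
/// crate for this; this helper only sketches the same idea.
fn wipe(buf: &mut [u8]) {
    for b in buf.iter_mut() {
        // SAFETY: `b` is a valid, exclusive reference to an initialised u8.
        unsafe { ptr::write_volatile(b, 0) };
    }
    compiler_fence(Ordering::SeqCst);
}

impl SecureKey {
    /// Accept only 16-byte keys (AES-128), an assumption of this sketch.
    /// Rejected key material is wiped before the error is returned.
    pub fn new(mut bytes: Vec<u8>) -> Result<Self, String> {
        if bytes.len() != 16 {
            let len = bytes.len();
            wipe(&mut bytes);
            return Err(format!("expected a 16-byte key, got {} bytes", len));
        }
        Ok(Self { bytes })
    }

    pub fn as_bytes(&self) -> &[u8] {
        &self.bytes
    }
}

/// Never print key material, even in debug output.
impl fmt::Debug for SecureKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("SecureKey(<redacted>)")
    }
}

/// Clear the key bytes when the value goes out of scope.
impl Drop for SecureKey {
    fn drop(&mut self) {
        wipe(&mut self.bytes);
    }
}
```

A caller would build the key with `SecureKey::new(raw_bytes)?` and pass `key.as_bytes()` to the cipher; the bytes are cleared when `key` is dropped.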

Key Features

  1. Java-Compatible Format: Ciphertext format matches Java's implementation exactly:
    [12-byte nonce][encrypted data][16-byte GCM authentication tag]
    This ensures files encrypted by Java can be decrypted by Rust and vice versa.
  2. Secure Key Handling: Uses the zeroize crate to automatically clear encryption keys from memory when dropped, preventing key material from lingering in memory.
  3. Additional Authenticated Data (AAD): Full support for AAD, which protects the integrity of associated metadata that is not itself encrypted.
  4. Comprehensive Testing: 8 tests covering:
    - Round-trip encryption/decryption for both AES-128 and AES-256
    - AAD validation
    - Empty plaintext handling
    - Tamper detection
    - Format compatibility verification
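The byte layout described under Java-Compatible Format can be sketched as a small parser. This is an illustration only and performs no cryptography: it just splits a buffer into the three regions named above. The `Parts` struct and the `split_blob` and `encrypted_len` functions are hypothetical names that do not appear in this PR.

```rust
/// Sizes taken from the format description: 12-byte nonce, 16-byte GCM tag.
const NONCE_LEN: usize = 12;
const TAG_LEN: usize = 16;

/// Borrowed views into one encrypted buffer (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub struct Parts<'a> {
    pub nonce: &'a [u8],
    pub ciphertext: &'a [u8],
    pub tag: &'a [u8],
}

/// Split `[nonce][encrypted data][tag]` into its three regions.
/// Returns `None` when the buffer is too short to hold a nonce and a tag.
pub fn split_blob(blob: &[u8]) -> Option<Parts<'_>> {
    if blob.len() < NONCE_LEN + TAG_LEN {
        return None;
    }
    let (nonce, rest) = blob.split_at(NONCE_LEN);
    let (ciphertext, tag) = rest.split_at(rest.len() - TAG_LEN);
    Some(Parts { nonce, ciphertext, tag })
}

/// GCM ciphertext is the same length as the plaintext, so the total size
/// is the plaintext length plus the fixed nonce and tag overhead.
pub fn encrypted_len(plaintext_len: usize) -> usize {
    NONCE_LEN + plaintext_len + TAG_LEN
}
```

Under this layout an empty plaintext still produces a 28-byte buffer (nonce plus tag), which is why empty-plaintext handling is worth testing separately.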

Dependencies Added

  • aes-gcm = "0.10" - Industry-standard AES-GCM implementation
  • zeroize = "1.7" - Secure memory cleanup for encryption keys

Compatibility

This implementation corresponds directly to Java's Ciphers class (https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/encryption/Ciphers.java):

Java Class                     Rust Implementation
Ciphers.AesGcmEncryptor        AesGcmEncryptor::encrypt()
Ciphers.AesGcmDecryptor        AesGcmEncryptor::decrypt()
EncryptionAlgorithm.AES_GCM    EncryptionAlgorithm::Aes128Gcm

Testing

Future Work

This PR is the first in a series to implement full encryption support. Upcoming PRs will add:

  1. Table properties for encryption configuration
  2. Key management interfaces (KeyManagementClient trait)
  3. EncryptionManager implementation
  4. Native Parquet encryption integration
  5. AWS KMS support
  6. Integration with Table and FileIO

Review Notes

  • This PR is intentionally minimal and self-contained
  • No existing code paths are modified - this is purely additive
  • The module is public but won't be used until future PRs wire it up
  • Format compatibility with Java has been prioritized to ensure interoperability

Which issue does this PR close?

What changes are included in this PR?

Are these changes tested?

Yes

@xanderbailey xanderbailey force-pushed the xb/core_encryption branch 4 times, most recently from 44020d9 to c5299d9 Compare January 14, 2026 17:53
@xanderbailey xanderbailey changed the title Add crypto for AES-GCM [1/N] Support encryption: Add crypto for AES-GCM Jan 14, 2026
@mbutrovich mbutrovich self-requested a review January 14, 2026 20:26
@mbutrovich
Collaborator

This is awesome! I will take a look since I did a lot of the PME support in Comet. Regarding AES-256-GCM support, last I looked Arrow-rs' Parquet reader only supported 128. Is that not the case anymore?

@xanderbailey
Author

You are correct by the looks of it! https://github.com/apache/arrow-rs/blob/main/parquet/src/encryption/ciphers.rs

@xanderbailey xanderbailey changed the title [1/N] Support encryption: Add crypto for AES-GCM feat(encryption) [1/N] Support encryption: Add crypto for AES-GCM Jan 15, 2026
@hsiang-c

hsiang-c commented Jan 16, 2026

@xanderbailey Thank you for the great work on encryption support!

Regarding arrow-rs, I removed the hardcoded AES-128 constraint in apache/arrow-rs#9203 because we're using PME with AES-256. Hope that helps a bit.

cc @mbutrovich

@xanderbailey
Author

This helps a bunch thanks!
