sipsalabs / ultracompress

Lossless 5-bit transformer compression. 19 architectures independently PPL-verified end-to-end (0.6B-405B; dense, MoE, and SSM): Hermes-3-405B 1.0066x, Phi-3.5-MoE 1.00129x, Phi-4 1.00506x. SHA-256-verifiable bit-identical reconstruction. OpenAI-compatible API at api.sipsalabs.com. Install: pip install ultracompress

Topics: python, compression, cuda, inference, pytorch, transformer, lossless, quantization, mlops, deep-tech, openai-api, llm, patent-pending, ai-infrastructure, 405b, consumer-gpu, 5-bit, sipsa-labs, experimental-tech

Stars: 12. Language: Python. Updated May 20, 2026.
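
To make the "SHA-256-verifiable bit-identical reconstruction" claim concrete, here is a minimal sketch of what such a check generally looks like. It uses only Python's standard hashlib module and is not the ultracompress API. The helper names (sha256_hex, is_bit_identical) and the sample byte strings are hypothetical, and the description does not say which artifacts the project actually hashes.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def is_bit_identical(original: bytes, reconstructed: bytes) -> bool:
    """Treat two byte strings as identical when their SHA-256 digests match."""
    return sha256_hex(original) == sha256_hex(reconstructed)


# Hypothetical stand-ins for serialized weights before compression and
# after decompression; real tensors would be serialized to bytes first.
original = bytes(range(256)) * 4
reconstructed = bytes(original)       # exact round trip
tampered = original[:-1] + b"\x00"    # last byte changed from 0xFF to 0x00

print(is_bit_identical(original, reconstructed))
print(is_bit_identical(original, tampered))
```

Comparing digests rather than raw buffers lets a publisher ship only the expected hash, so a user can confirm a local reconstruction without holding the original file.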