Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation (NeurIPS 2025)
A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).
[ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
(NeurIPS 2019 MicroNet Challenge, 3rd-place winner) Open-source code for "SIPA: A simple framework for efficient networks"
Bio-inspired LLM architecture achieving a ~5x reduction in attention FLOPs via Context-Aware Target-Sparsity Routing. Features custom OpenAI Triton kernels optimized for the RTX 5090, delivering 1.41M tok/s throughput.
The ARL Hierarchical MultiScale Framework (ARL-HMS) is a software library for developing multiscale models on heterogeneous high-performance computing systems.
Model implementation for "Adaptive computation as a new mechanism of dynamic human attention"
Lightweight PyTorch implementation of Mixture-of-Recursions with Expert-Choice & Token-Choice routing | Runs on your laptop!
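The two routing schemes named above can be sketched in a few lines. This is a minimal, hypothetical illustration (the function names and scoring scheme are assumptions, not the repo's actual API): in token-choice routing each token's router score determines its own recursion depth, while in expert-choice routing each recursion step selects a fixed-capacity subset of tokens to process again.

```python
# Hypothetical sketch of the two routing schemes; not the repo's real API.

def token_choice(scores, num_depths):
    """Token-choice: each token picks its recursion depth from its score.

    scores: per-token router scores in [0, 1).
    Returns the recursion depth assigned to each token.
    """
    return [int(s * num_depths) for s in scores]

def expert_choice(scores, capacity):
    """Expert-choice: each recursion step picks its top-`capacity` tokens.

    Returns the (sorted) indices of tokens selected to recurse again;
    the rest exit early.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:capacity])

scores = [0.91, 0.15, 0.55, 0.72]
print(token_choice(scores, num_depths=3))  # -> [2, 0, 1, 2]
print(expert_choice(scores, capacity=2))   # -> [0, 3]
```

The trade-off mirrored here is the one MoE routing faces: token-choice gives each token autonomy over its depth but yields uneven per-step load, while expert-choice guarantees a fixed compute budget per recursion step at the cost of possibly dropping high-scoring tokens once capacity is full.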