Pull requests: cactus-compute/cactus

Pull requests list

LFM NPU Fallback + Tokenizer Fixes
#605 opened Apr 22, 2026 by ParkiratS Collaborator
Apple GPU Support
#604 opened Apr 22, 2026 by justinl66 Member Draft
Karen/tq
#603 opened Apr 21, 2026 by kar-m Collaborator
Split native LLM ownership into Model and Context
#602 opened Apr 21, 2026 by aarnav-11
mlx added
#587 opened Apr 15, 2026 by kar-m Collaborator Draft
fix gemma4 audio/vision crash when NPU falls back to CPU
#586 opened Apr 15, 2026 by ncylich Collaborator
4 tasks done
Gemma sp tokenizer
#583 opened Apr 15, 2026 by aarnav-11
Graph remaining ops
#578 opened Apr 14, 2026 by cattermelon1234 Contributor
Turboquant attention kernel
#573 opened Apr 13, 2026 by jrajala6 Contributor
Follow-up: consolidate sampling APIs after #560
#569 opened Apr 10, 2026 by DuFanYin Contributor
Qualcomm NPU Support
#563 opened Apr 7, 2026 by justinl66 Member Draft
Structured Generation
#555 opened Apr 6, 2026 by mhayes853 Contributor
Stateful chunked TDT streaming transcription
#552 opened Apr 5, 2026 by rshemet Collaborator
3 of 4 tasks
Add IBM Granite 3.3 model support
#541 opened Mar 31, 2026 by vyomshah05 Contributor
Diarization
#537 opened Mar 26, 2026 by ParkiratS Collaborator Draft
Per-layer KV heads, attention logit capping, MoE per-expert scales, NPU multi-input
#526 opened Mar 19, 2026 by ncylich Collaborator
4 tasks done
Accelerate MatMul FP16 for Apple GPUs
#523 opened Mar 17, 2026 by aarav18 Contributor
reverting attn exp calculations to before 3n
#511 opened Mar 9, 2026 by ncylich Collaborator
Fix gemma multi tool call and logit biasing
#510 opened Mar 8, 2026 by lennartvoelz Contributor
new approximation for exponent on (0,1)
#500 opened Mar 6, 2026 by kar-m Collaborator
Optimized Attention
#480 opened Mar 2, 2026 by ncylich Collaborator
Axis reductions fixes
#473 opened Feb 28, 2026 by cattermelon1234 Contributor Draft
Benchmarking against other quantized kernels
#458 opened Feb 26, 2026 by ncylich Collaborator