Pull requests: ROCm/aiter
Pull requests list
Optimize dynamic_per_group_scaled_quant via compile-time group_size t…
#3037
opened May 5, 2026 by
RichardChamberlain1
Contributor
•
Draft
add rope/rotate_activation/fp4_quant_inplace fused kernel for dsv4
#3035
opened May 5, 2026 by
junhaha666
Contributor
1 task
[TRITON] Add scattered-pointer Q4_K_M MoE matvec kernel for streaming inference
#3034
opened May 5, 2026 by
ssubbotin
3 tasks done
Fix sqrsum store race condition in mhc_pre_gemm_sqrsum_kernel
#3033
opened May 5, 2026 by
kkHuang-amd
Contributor
2 of 3 tasks
fix(batch_prefill): OOB page table read fix via CK bump + regression tests (AICK-1171)
#3032
opened May 5, 2026 by
Jeff-Huang
Contributor
4 tasks done
attention.cu: guard out-of-head Q load in mfma16 paged-attention kernel
#3031
opened May 5, 2026 by
JohnQinAMD
1 task
Fix batched_model_benchmark_shapes returning hidden/intermediate sizes swapped
#3019
opened May 4, 2026 by
apicciau
Contributor
1 task done
Remove async copy override from Triton test workflow
ci:triton-300x
ci:triton-355
#3018
opened May 4, 2026 by
nidal567
Contributor
1 task done
CI: drop signal artifact, limit downstream on Checks run state
#3017
opened May 4, 2026 by
leo-automation
[MoE] Cache split-K scratch buffers to avoid per-call hipMalloc
#3016
opened May 4, 2026 by
frida-andersson
Contributor
test: xfail test_moe_routing on gfx950 for known topk tie-breaking mismatch
#3015
opened May 4, 2026 by
sunway513
Collaborator
3 tasks
Add fp8 mla decode kernel for sub_kv=64, sub_qh=8 (gqa_ratio=8, qseql…
#3014
opened May 3, 2026 by
JohnNikolay84
Contributor
1 task
fix(mla): bypass fp8 qseqlen2 kernel precision issue on gfx950
#3013
opened May 3, 2026 by
fangche123
Contributor
1 task
ci: replace deprecated zmq with pyzmq in CI scripts
#3007
opened May 3, 2026 by
sunway513
Collaborator
Add HipKittens based nhead=32 MLA kernel on MI35x / gfx950
#3003
opened May 1, 2026 by
hubertlu-tw
Contributor
•
Draft
8 of 9 tasks
ci(nightly): fix wheel/image ABI mismatch + 0-test false-pass (run 25202894144)
#3002
opened May 1, 2026 by
sunway513
Collaborator
Replace QH16 bf16 kernel with a new one that does not use ptr_RP
#2999
opened May 1, 2026 by
JohnNikolay84
Contributor
1 task