Pull requests: ggml-org/llama.cpp

Pull requests list

Fix Bad Substitution Error [labels: examples, SYCL]
#22737 opened May 6, 2026 by dogunbound
SYCL: reduce allocation overhead during flash attention [labels: ggml, SYCL]
#22732 opened May 5, 2026 by sanmai
opencl: add q4_0 MoE GEMM for Adreno [labels: ggml, OpenCL]
#22731 opened May 5, 2026 by shawngu-quic (Contributor)
Write a readme on Multi-GPU usage in llama.cpp [labels: documentation]
#22729 opened May 5, 2026 by gaugarg-nv (Contributor)
llama : extend embeddings API [labels: model]
#22728 opened May 5, 2026 by ggerganov (Member), Draft
metal : promote mul_mv/mul_mm batch divisors to function constants [labels: Apple Metal, ggml]
#22711 opened May 5, 2026 by guyfischman
Fuse rms_norm, mul, quantize_q8_1 [labels: ggml, Nvidia GPU, testing]
#22710 opened May 5, 2026 by lnigam (Contributor)
Feat/qlora training [labels: Apple Metal, examples, ggml, Nvidia GPU, python, Vulkan]
#22705 opened May 5, 2026 by srossitto79, Draft
Feat/backward mul mat [labels: Apple Metal, ggml, Nvidia GPU, Vulkan]
#22704 opened May 5, 2026 by srossitto79, Draft
vulkan: Check shared memory size for mmq shaders [labels: ggml, Vulkan]
#22693 opened May 4, 2026 by jeffbolznv (Contributor)
llama: add pshard runtime for plan switching and streamed weights [labels: examples, ggml, Nvidia GPU]
#22692 opened May 4, 2026 by aukarande
tools: add llama-pshard-plan-params for token-tiered placement planning [labels: examples, ggml, Nvidia GPU]
#22691 opened May 4, 2026 by aukarande
tests: add BF16 non-contig coverage for MUL_MAT permutations [labels: testing]
#22689 opened May 4, 2026 by ServeurpersoCom (Contributor)
ggml : use CL_DEVICE_GLOBAL_MEM_SIZE as estimate for OpenCL --fit [labels: ggml, OpenCL]
#22688 opened May 4, 2026 by fl0rianr
vulkan: optimize operations in the IM2COL shader [labels: ggml, Vulkan]
#22685 opened May 4, 2026 by daniandtheweb (Contributor)
ggml-zendnn : adaptive fallback to CPU backend for small batch sizes [labels: AMD ZenDNN, ggml]
#22681 opened May 4, 2026 by z-sachin
ci: validate model naming convention [labels: devops]
#22680 opened May 4, 2026 by ngxson (Contributor)