Pull requests: ggml-org/llama.cpp

Pull requests list

Parallelize quant LUT init (labels: ggml)
#23595 opened May 24, 2026 by jeffbolznv Contributor
ggml-webgpu: Add MMVQ path for Q4/Q8/Q2_K/Q4_K (labels: ggml, WebGPU)
#23594 opened May 24, 2026 by yomaytk Contributor
ggml: fix AVX-512 BF16 build with clang-cl (labels: ggml)
#23593 opened May 24, 2026 by marcusds
chore: reuse find() iterator in LLM_TN_IMPL::str()
#23591 opened May 24, 2026 by fansehep
Update build.md with Fedora Vulkan dependencies (labels: documentation)
#23584 opened May 23, 2026 by JCTRoth
cmake : error when LLAMA_BUILD_APP=ON and LLAMA_BUILD_TOOLS=OFF (labels: build)
#23580 opened May 23, 2026 by Pento95
Static quantize mtp layers
#23575 opened May 23, 2026 by de-wim Draft
CUDA: native 4-bit float quant (Blackwell PP +40%) (labels: examples)
#23572 opened May 23, 2026 by sanmai Contributor
vulkan: Refactor vk_queue to use per-instance mutexes and unique handles (labels: ggml, Vulkan)
#23570 opened May 23, 2026 by winstonma Contributor
Metal : detect Apple SoC at backend init (labels: Apple Metal, ggml)
#23566 opened May 23, 2026 by forforever73 Contributor
opencl: add basic support for q5_0 and q5_1 (labels: ggml, OpenCL)
#23548 opened May 22, 2026 by shaofeiqi Contributor Draft
model: Granite4 Vision (labels: examples, model, python)
#23545 opened May 22, 2026 by gabe-l-hart Collaborator
4 tasks
vulkan: use GL_NV_cooperative_matrix_decode_vector for faster matmul (labels: ggml, Vulkan)
#23541 opened May 22, 2026 by jeffbolznv Contributor
Hexagon: OP_GATED_DELTA_NET K>1 support (labels: ggml, Hexagon)
#23531 opened May 22, 2026 by ymcki Contributor
CUDA: Check PTX version on host side to guard PDL dispatch (labels: ggml, Nvidia GPU)
#23530 opened May 22, 2026 by ORippler Collaborator
ggml-cuda: tune RDNA3 Q4_K MMVQ nwarps (labels: ggml, Nvidia GPU)
#23528 opened May 22, 2026 by ravel7524 Contributor
ci : add approval gate to remaining workflows (labels: devops)
#23526 opened May 22, 2026 by ggerganov Member Draft
2 tasks done
document that only one on-device state can be saved per sequence
#23520 opened May 22, 2026 by TimNN Contributor