Pull requests: ggml-org/llama.cpp
devops: add missing GGML_BACKEND_DL and GGML_CPU_ALL_VARIANTS to Dockerfiles (#23292)
#23600
opened May 24, 2026 by
mrigankad
ui: use password input for API key to prevent browser autofill (#23254)
#23599
opened May 24, 2026 by
mrigankad
docs: fix dead link to settings config in server README (#23093)
#23598
opened May 24, 2026 by
mrigankad
common: add LLAMA_ARG_API_KEY_FILE env var for --api-key-file (#23165)
#23597
opened May 24, 2026 by
mrigankad
common: skip comment lines in --api-key-file (#23166)
#23596
opened May 24, 2026 by
mrigankad
Parallelize quant LUT init
ggml
changes relating to the ggml tensor library for machine learning
#23595
opened May 24, 2026 by
jeffbolznv
Contributor
ggml-webgpu: Add MMVQ path for Q4/Q8/Q2_K/Q4_K
ggml
changes relating to the ggml tensor library for machine learning
WebGPU
#23594
opened May 24, 2026 by
yomaytk
Contributor
ggml: fix AVX-512 BF16 build with clang-cl
ggml
changes relating to the ggml tensor library for machine learning
#23593
opened May 24, 2026 by
marcusds
Update build.md with Fedora Vulkan dependencies
documentation
Improvements or additions to documentation
#23584
opened May 23, 2026 by
JCTRoth
cmake : error when LLAMA_BUILD_APP=ON and LLAMA_BUILD_TOOLS=OFF
build
Compilation issues
#23580
opened May 23, 2026 by
Pento95
CUDA: native 4-bit float quant (Blackwell PP +40%)
examples
#23572
opened May 23, 2026 by
sanmai
Contributor
vulkan: Refactor vk_queue to use per-instance mutexes and unique handles
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#23570
opened May 23, 2026 by
winstonma
Contributor
Metal : detect Apple SoC at backend init
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#23566
opened May 23, 2026 by
forforever73
Contributor
server: fix --cache-ram not preventing RAM OOM
examples
server
#23561
opened May 23, 2026 by
zzhenyao
opencl: add basic support for q5_0 and q5_1
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
model: Granite4 Vision
examples
model
Model specific
python
python script changes
#23545
opened May 22, 2026 by
gabe-l-hart
Collaborator
4 tasks
vulkan: use GL_NV_cooperative_matrix_decode_vector for faster matmul
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#23541
opened May 22, 2026 by
jeffbolznv
Contributor
Hexagon: OP_GATED_DELTA_NET K>1 support
ggml
changes relating to the ggml tensor library for machine learning
Hexagon
#23531
opened May 22, 2026 by
ymcki
Contributor
CUDA: Check PTX version on host side to guard PDL dispatch
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#23530
opened May 22, 2026 by
ORippler
Collaborator
ggml-cuda: tune RDNA3 Q4_K MMVQ nwarps
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#23528
opened May 22, 2026 by
ravel7524
Contributor
ci : add approval gate to remaining workflows
devops
improvements to build systems and github actions
document that only one on-device state can be saved per sequence
#23520
opened May 22, 2026 by
TimNN
Contributor