ollama/fs/ggml
commit e823bff873
Author: Daniel Hiltgen
Date:   2026-04-07 08:12:36 -07:00

    gemma4: enable flash attention (#15378)

    Backport GGML kernels so we can enable flash attention for the gemma 4
    model on Metal and CUDA.
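The commit implies a capability gate: flash attention is turned on only when both the model architecture and the active backend have the required kernels. Below is a minimal Go sketch of that gating logic; every name in it (Backend, flashAttnArchs, supportsFlashAttention) is a hypothetical illustration, not ollama's actual API.

    // Hypothetical sketch of a flash-attention capability gate, assuming a
    // per-architecture allowlist and per-backend kernel availability. These
    // names are illustrative only and do not mirror ollama's real code.
    package main

    import "fmt"

    // Backend identifies a compute backend (assumption for this sketch).
    type Backend string

    const (
        Metal Backend = "metal"
        CUDA  Backend = "cuda"
        CPU   Backend = "cpu"
    )

    // flashAttnArchs lists architectures with flash-attention kernels on GPU
    // backends; adding "gemma4" here models the effect of this commit.
    var flashAttnArchs = map[string]bool{
        "llama":  true,
        "gemma4": true,
    }

    // supportsFlashAttention reports whether flash attention can be enabled
    // for the given architecture on the given backend (hypothetical helper).
    func supportsFlashAttention(arch string, b Backend) bool {
        if !flashAttnArchs[arch] {
            return false
        }
        // In this sketch only Metal and CUDA have the backported kernels.
        return b == Metal || b == CUDA
    }

    func main() {
        for _, b := range []Backend{Metal, CUDA, CPU} {
            fmt.Printf("gemma4 on %-5s: flash attention = %v\n",
                b, supportsFlashAttention("gemma4", b))
        }
    }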
File          Last commit                                      Date
ggml.go       gemma4: enable flash attention (#15378)          2026-04-07 08:12:36 -07:00
ggml_test.go  ggml: fix crash for array head counts            2025-04-27 11:38:06 -07:00
gguf.go       ggml: ensure tensor size is valid (#14406)       2026-02-24 21:52:44 -04:00
gguf_test.go  ggml: ensure tensor size is valid (#14406)       2026-02-24 21:52:44 -04:00
type.go       fs/ggml: fix function name in comment (#12630)   2025-10-15 21:53:38 -07:00