ollama

mirror of https://github.com/ollama/ollama synced 2026-04-23 08:45:14 +00:00

Daniel Hiltgen e823bff873 gemma4: enable flash attention (#15378) Backport GGML kernels so we can enable flash attention for the gemma 4 model on Metal and CUDA.	2026-04-07 08:12:36 -07:00
ggml	gemma4: enable flash attention (#15378)	2026-04-07 08:12:36 -07:00
gguf	Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)	2025-06-20 11:11:40 -07:00
util/bufioutil	next ollama runner (#7913)	2025-02-13 16:31:21 -08:00
config.go	Add experimental MLX backend and engine with imagegen support (#13648)	2026-01-08 16:18:59 -08:00