Commit graph

1 commit

Author: Daniel Hiltgen
SHA1: e823bff873
Date: 2026-04-07 08:12:36 -07:00
Message: gemma4: enable flash attention (#15378)

Backport GGML kernels so we can enable flash attention for the gemma 4 model on
Metal and CUDA.