mirror of https://github.com/ollama/ollama synced 2026-04-23 08:45:14 +00:00

jmorganca e23ddd84b8 x/grammar: add experimental GPU accelerated constrained decoding package		2026-01-11 00:50:11 -08:00
agent	x: request access for all commands, add welcome message (#13662)	2026-01-09 18:20:39 -08:00
cmd	x: request access for all commands, add welcome message (#13662)	2026-01-09 18:20:39 -08:00
grammar	x/grammar: add experimental GPU accelerated constrained decoding package	2026-01-11 00:50:11 -08:00
imagegen	x/grammar: add experimental GPU accelerated constrained decoding package	2026-01-11 00:50:11 -08:00
kvcache	Add experimental MLX backend and engine with imagegen support (#13648)	2026-01-08 16:18:59 -08:00
ml	Add experimental MLX backend and engine with imagegen support (#13648)	2026-01-08 16:18:59 -08:00
model	Add experimental MLX backend and engine with imagegen support (#13648)	2026-01-08 16:18:59 -08:00
tools	x: request access for all commands, add welcome message (#13662)	2026-01-09 18:20:39 -08:00
README.md	Add experimental MLX backend and engine with imagegen support (#13648)	2026-01-08 16:18:59 -08:00

Experimental Features

MLX Backend

We're working on a new experimental backend based on the MLX project.

Support is currently limited to macOS and Linux with CUDA GPUs. We're looking to add support for Windows with CUDA soon, as well as other GPU vendors. To build:

cmake --preset MLX
cmake --build --preset MLX --parallel
cmake --install build --component MLX
go build -tags mlx .

On Linux, use the preset "MLX CUDA 13" or "MLX CUDA 12" instead to enable CUDA with the default Ollama NVIDIA GPU architectures.
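
As a rough sketch of how the preset choice could be scripted, the helper below maps an OS name and CUDA major version to one of the preset names mentioned above. The function name `mlx_preset` and the default of CUDA 13 are assumptions made for illustration; they are not part of Ollama. It also assumes macOS uses the plain "MLX" preset.

```shell
# Hypothetical helper (not part of Ollama): print the CMake preset name for a
# given OS name (as reported by `uname -s`) and CUDA major version.
mlx_preset() {
  os="$1"
  cuda="${2:-13}"   # assumed default; presets are listed for CUDA 13 and 12
  case "$os" in
    Darwin)
      # Assumption: macOS builds use the plain MLX preset.
      echo "MLX"
      ;;
    Linux)
      case "$cuda" in
        12|13) echo "MLX CUDA $cuda" ;;
        *) echo "unsupported CUDA version: $cuda" >&2; return 1 ;;
      esac
      ;;
    *)
      # Other platforms are not supported by the experimental backend yet.
      echo "unsupported OS: $os" >&2
      return 1
      ;;
  esac
}
```

With the helper defined, `preset="$(mlx_preset "$(uname -s)" 12)"` followed by `cmake --preset "$preset"` and `cmake --build --preset "$preset" --parallel` would run the configure and build steps, assuming the same preset name is passed to both, as in the MLX commands above.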

We're also working on adding imagegen support, built on the experimental MLX backend. After running the cmake commands above, build the imagegen engine:

go build -o imagegen ./x/imagegen/cmd/engine