mirror of
https://github.com/ollama/ollama
synced 2026-05-05 22:54:33 +00:00
This change adds a new MLX based runner which includes:

* Method-based MLX bindings
* Subprocess-based MLX runner (x/mlxrunner)
* KV cache with tree management
* A basic sampler

The GLM4-MoE-Lite model has been ported to use the new bindings.

---------

Co-authored-by: Michael Yang <git@mxy.ng>

||
|---|---|---|
| cache.go | ||