ollama

mirror of https://github.com/ollama/ollama synced 2026-04-23 08:45:14 +00:00

Jesse Gross bbbad97686 sched: Model eviction for MLX	2026-03-16 17:40:29 -07:00

MLX runners (image generation and LLM) previously bypassed the scheduler's standard load path via a separate loadMLX method. This meant they skipped VRAM fitting checks and couldn't participate in model eviction. Now all model types flow through the same load function.

Model eviction for MLX is based on weights, because the KV cache and compute graph are dynamic. This means that eviction does not take worst-case memory into account and models can still compete for memory, but it is a significant improvement.

agent	x/cmd: enable web search and web fetch with flag (#13690)	2026-01-12 13:59:40 -08:00
cmd	Reapply "don't require pulling stubs for cloud models" again (#14608)	2026-03-06 14:27:47 -08:00
create	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
imagegen	sched: Model eviction for MLX	2026-03-16 17:40:29 -07:00
mlxrunner	sched: Model eviction for MLX	2026-03-16 17:40:29 -07:00
models	mlx: perf improvements (#14768)	2026-03-12 12:01:28 -07:00
server	bugfix: display the parameter count correctly in mlx for ollama show (#14285)	2026-02-16 13:03:34 -08:00
tokenizer	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
tools	add ability to disable cloud (#14221)	2026-02-12 15:47:00 -08:00