ollama

mirror of https://github.com/ollama/ollama synced 2026-04-23 08:45:14 +00:00

Jesse Gross bbbad97686 sched: Model eviction for MLX	2026-03-16 17:40:29 -07:00

MLX runners (image generation and LLM) previously bypassed the scheduler's standard load path via a separate loadMLX method. As a result, they skipped VRAM fitting checks and could not participate in model eviction. Now all model types flow through the same load function.

Model eviction for MLX is based on weight size only, because the KV cache and compute graph are dynamic. Eviction therefore does not account for worst-case memory use, and models can still compete for memory, but it is a significant improvement.
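
The weight-based eviction described in the commit message can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the ollama scheduler: the loadedModel type, the pickEvictions function, and the least-recently-used ordering are assumptions made for illustration only. The sketch counts weights alone, so, as the commit notes, it does not cover worst-case memory for the KV cache or compute graph.

```go
package main

import (
	"fmt"
	"sort"
)

// loadedModel is a hypothetical record of a resident model.
// It is NOT a type from the ollama scheduler.
type loadedModel struct {
	name        string
	weightBytes uint64 // weights only; KV cache and compute graph are not counted
	lastUsed    int64  // logical timestamp; larger means more recently used
}

// pickEvictions returns the names of models to unload so that a new model
// with needBytes of weights fits into freeBytes. Models are evicted oldest
// first (an assumed policy). ok is false when evicting every loaded model
// would still not make enough room.
func pickEvictions(loaded []loadedModel, freeBytes, needBytes uint64) (evict []string, ok bool) {
	if needBytes <= freeBytes {
		return nil, true // already fits, nothing to evict
	}

	// Sort a copy so the caller's slice is left untouched.
	candidates := make([]loadedModel, len(loaded))
	copy(candidates, loaded)
	sort.Slice(candidates, func(i, j int) bool {
		return candidates[i].lastUsed < candidates[j].lastUsed
	})

	for _, m := range candidates {
		evict = append(evict, m.name)
		freeBytes += m.weightBytes
		if needBytes <= freeBytes {
			return evict, true
		}
	}
	return nil, false
}

func main() {
	const gib = 1 << 30
	loaded := []loadedModel{
		{name: "llm-a", weightBytes: 8 * gib, lastUsed: 3},
		{name: "image-b", weightBytes: 12 * gib, lastUsed: 1},
		{name: "llm-c", weightBytes: 4 * gib, lastUsed: 2},
	}
	// 2 GiB free, 16 GiB of weights requested.
	evict, ok := pickEvictions(loaded, 2*gib, 16*gib)
	fmt.Println(evict, ok) // prints "[image-b llm-c] true"
}
```

Because only weights are counted, a plan that reports ok can still run short of memory once the KV cache grows, which is the limitation the commit message acknowledges.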
cache	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
mlx	mlx: perf improvements (#14768)	2026-03-12 12:01:28 -07:00
model	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
sample	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
cache.go	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
client.go	sched: Model eviction for MLX	2026-03-16 17:40:29 -07:00
imports.go	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
pipeline.go	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
runner.go	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
server.go	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
utf8_buffer.go	consolidate the tokenizer (#14327)	2026-02-19 15:55:45 -08:00
utf8_buffer_test.go	consolidate the tokenizer (#14327)	2026-02-19 15:55:45 -08:00