ollama/server
Patrick Devine d07e4a1dd3
bugfix: better mlx model scheduling (#14290)
This fixes a bug where current MLX-based models are not loaded and unloaded correctly. The first model is loaded, but subsequent model starts are shunted to the first runner, so the wrong model is run.
2026-02-17 13:57:05 -08:00
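The failure mode described in the commit message can be illustrated with a minimal sketch. This is not the code in sched.go: the `runner` and `scheduler` types and the `buggyRunnerFor`, `runnerFor`, and `unload` methods are hypothetical names made up for illustration, and keying runners by model name is only an assumption about what a correct lookup might look like.

```go
package main

import "fmt"

// runner is a hypothetical stand-in for a process serving one loaded model.
type runner struct {
	model string
}

// scheduler is a hypothetical, simplified scheduler. It is not Ollama's
// implementation; it only models the behaviour described in the commit message.
type scheduler struct {
	first  *runner            // the first runner ever started (used by the buggy path)
	loaded map[string]*runner // runners keyed by model name (used by the fixed path)
}

func newScheduler() *scheduler {
	return &scheduler{loaded: make(map[string]*runner)}
}

// buggyRunnerFor reproduces the described failure: once a runner exists,
// every later request is shunted to it, whichever model was asked for.
func (s *scheduler) buggyRunnerFor(model string) *runner {
	if s.first == nil {
		s.first = &runner{model: model}
	}
	return s.first
}

// runnerFor looks the runner up by model name and starts a new one on a miss,
// so each request reaches a runner for the model it asked for.
func (s *scheduler) runnerFor(model string) *runner {
	if r, ok := s.loaded[model]; ok {
		return r
	}
	r := &runner{model: model}
	s.loaded[model] = r
	return r
}

// unload forgets the runner for a model so the next request starts a fresh one.
func (s *scheduler) unload(model string) {
	delete(s.loaded, model)
}

func main() {
	buggy := newScheduler()
	buggy.buggyRunnerFor("model-a")
	fmt.Println(buggy.buggyRunnerFor("model-b").model) // prints "model-a" (wrong model)

	fixed := newScheduler()
	fixed.runnerFor("model-a")
	fmt.Println(fixed.runnerFor("model-b").model) // prints "model-b"
}
```

In this sketch the only difference between the two paths is the lookup key: the buggy path ignores the requested model once a runner exists, while the keyed path never hands a request to a runner for a different model.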
internal docs: fix typos in repository documentation (#10683) 2025-11-15 20:22:29 -08:00
aliases.go add ability to disable cloud (#14221) 2026-02-12 15:47:00 -08:00
auth.go server: reject unexpected auth hosts (#13738) 2026-01-16 14:10:36 -05:00
auth_test.go server: reject unexpected auth hosts (#13738) 2026-01-16 14:10:36 -05:00
create.go Clean up the manifest and modelpath (#13807) 2026-01-21 11:46:17 -08:00
create_test.go Clean up the manifest and modelpath (#13807) 2026-01-21 11:46:17 -08:00
download.go Clean up the manifest and modelpath (#13807) 2026-01-21 11:46:17 -08:00
fixblobs.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
fixblobs_test.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
images.go x/imagegen: add image edit capabilities (#13846) 2026-01-22 20:35:08 -08:00
images_test.go x/imagegen: add image edit capabilities (#13846) 2026-01-22 20:35:08 -08:00
logprob.go logprob: add bytes to logprobs (#13068) 2025-11-13 13:49:25 -08:00
model.go Clean up the manifest and modelpath (#13807) 2026-01-21 11:46:17 -08:00
prompt.go server: optimize chatPrompt to reduce tokenization calls (#14040) 2026-02-04 01:21:31 -08:00
prompt_test.go server: optimize chatPrompt to reduce tokenization calls (#14040) 2026-02-04 01:21:31 -08:00
quantization.go model: add qwen3-next architecture (#14051) 2026-02-03 23:27:21 -08:00
quantization_test.go Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119) 2025-06-20 11:11:40 -07:00
routes.go mlxrunner fixes (#14247) 2026-02-13 22:30:42 -08:00
routes_aliases.go cmd: ollama launch improvements (#14099) 2026-02-05 15:08:17 -08:00
routes_aliases_test.go add ability to disable cloud (#14221) 2026-02-12 15:47:00 -08:00
routes_cloud_test.go add ability to disable cloud (#14221) 2026-02-12 15:47:00 -08:00
routes_create_test.go Clean up the manifest and modelpath (#13807) 2026-01-21 11:46:17 -08:00
routes_debug_test.go server: use tiered VRAM-based default context length 2026-02-02 10:47:09 -08:00
routes_delete_test.go Clean up the manifest and modelpath (#13807) 2026-01-21 11:46:17 -08:00
routes_generate_renderer_test.go server: use tiered VRAM-based default context length 2026-02-02 10:47:09 -08:00
routes_generate_test.go bugfix: better mlx model scheduling (#14290) 2026-02-17 13:57:05 -08:00
routes_harmony_streaming_test.go preserve tool definition and call JSON ordering (#13525) 2026-01-05 18:03:36 -08:00
routes_list_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
routes_options_test.go server: use tiered VRAM-based default context length 2026-02-02 10:47:09 -08:00
routes_test.go server: return error when embedding contains NaN or Inf values (#13599) 2026-01-03 02:20:12 -05:00
sched.go bugfix: better mlx model scheduling (#14290) 2026-02-17 13:57:05 -08:00
sched_test.go bugfix: better mlx model scheduling (#14290) 2026-02-17 13:57:05 -08:00
sparse_common.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
sparse_windows.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
test_home_test.go add ability to disable cloud (#14221) 2026-02-12 15:47:00 -08:00
upload.go Clean up the manifest and modelpath (#13807) 2026-01-21 11:46:17 -08:00