Mirror of https://github.com/ollama/ollama, synced 2026-04-23 08:45:14 +00:00
Match the ollamarunner and OpenAI semantics: a raw, full-vocabulary log-softmax, with the top-K tokens ranked by probability. The computation is skipped on the GPU when the request doesn't ask for logprobs, so decode doesn't pay for it otherwise.
| Name |
|---|
| cache |
| mlx |
| model |
| sample |
| cache.go |
| cache_test.go |
| cache_trie.go |
| cache_trie_test.go |
| client.go |
| imports.go |
| pipeline.go |
| runner.go |
| server.go |
| utf8_buffer.go |
| utf8_buffer_test.go |