ollama/x/mlxrunner
Daniel Hiltgen 06ae6367bd
mlx: fix RotatingKVCache.concat() dropping context on mid-rotation (#15591)
After the rotating buffer has wrapped (c.offset > c.maxSize), a subsequent
multi-token (L > 1) Update() took a slice-to-[0, c.idx) path that discarded
all slots in [c.idx, Dim), losing the older-but-still-in-window tokens that
the first query of the new batch needs for its sliding-window attention.

Linearize the circular buffer to logical order in that wrapped case so
the existing trim + concat preserves the last (maxSize - 1) old tokens.
When the buffer has not yet wrapped (c.offset <= c.maxSize), slots
[c.idx, Dim) are grow padding or stale post-rewind data, so keep
dropping them.
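
The wrapped and not-yet-wrapped cases described above can be illustrated
with a minimal sketch on a plain Go slice. This is not the code from the
commit: the real cache operates on MLX arrays along the sequence axis, and
the names linearize and concatWindow, the int token type, and the explicit
idx/offset/maxSize parameters are all hypothetical, chosen only to mirror
the fields named in the message.

```go
package main

// linearize returns the ring buffer contents in logical (oldest to newest)
// order. idx is the next write position. Hypothetical helper, not the
// actual mlxrunner API.
func linearize(buf []int, idx int, wrapped bool) []int {
	if !wrapped {
		// Slots [idx, len(buf)) are padding or stale data: drop them.
		return append([]int(nil), buf[:idx]...)
	}
	// After wrapping, the oldest live token sits at idx, so the logical
	// order is buf[idx:] followed by buf[:idx].
	out := make([]int, 0, len(buf))
	out = append(out, buf[idx:]...)
	out = append(out, buf[:idx]...)
	return out
}

// concatWindow linearizes the old contents, trims them to the last
// maxSize-1 tokens, and appends the new multi-token batch.
func concatWindow(buf []int, idx, offset, maxSize int, batch []int) []int {
	old := linearize(buf, idx, offset > maxSize)
	if keep := maxSize - 1; len(old) > keep {
		old = old[len(old)-keep:]
	}
	return append(old, batch...)
}
```

For example, with maxSize 4 and tokens 1..6 already written, the physical
buffer is [5 6 3 4] with idx 2 and offset 6. Calling
concatWindow(buf, 2, 6, 4, []int{7, 8}) yields [4 5 6 7 8], whereas slicing
to [0, idx) first would yield [5 6 7 8] and drop token 4, which token 7
still needs inside a window of 4.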
2026-04-14 18:29:06 -07:00
cache mlx: fix RotatingKVCache.concat() dropping context on mid-rotation (#15591) 2026-04-14 18:29:06 -07:00
mlx mlx: Improve gemma4 performance with fused operations (#15587) 2026-04-14 18:04:04 -07:00
model mlx: mixed-precision quant and capability detection improvements (#15409) 2026-04-13 11:43:07 -07:00
sample mlxrunner: fix Slice(0, 0) returning full dimension instead of empty 2026-03-18 16:06:33 -07:00
cache.go mlxrunner: schedule periodic snapshots during prefill 2026-03-26 13:32:11 -07:00
cache_test.go mlxrunner: schedule periodic snapshots during prefill 2026-03-26 13:32:11 -07:00
cache_trie.go mlxrunner: share KV cache across conversations with common prefixes 2026-03-18 16:06:33 -07:00
cache_trie_test.go mlxrunner: share KV cache across conversations with common prefixes 2026-03-18 16:06:33 -07:00
client.go mlx: use default http client (#15405) 2026-04-07 14:53:23 -07:00
imports.go Gemma4 on MLX (#15244) 2026-04-13 16:36:51 -07:00
pipeline.go mlx: add compiled closure support 2026-04-14 16:38:32 -07:00
runner.go mlx: add compiled closure support 2026-04-14 16:38:32 -07:00
server.go mlx: quantized embeddings, fast SwiGLU, and runtime fixes (#14884) 2026-03-17 11:21:38 -07:00
utf8_buffer.go consolidate the tokenizer (#14327) 2026-02-19 15:55:45 -08:00
utf8_buffer_test.go consolidate the tokenizer (#14327) 2026-02-19 15:55:45 -08:00