ollama/x
Daniel Hiltgen 4d14b0ff92
mlx: respect tokenizer add_bos_token setting in pipeline (#15185)
Replace hardcoded Encode(prompt, true) with
Encode(prompt, r.Tokenizer.AddBOS()) so the pipeline respects each
model's tokenizer configuration.

Models with add_bos_token=true (gemma3, llama): unchanged, tokenizer
still prepends BOS.

Models with bos_token=null (qwen3, qwen3.5): unchanged, the BOS
guard (vocab.BOS >= 0) already prevented prepending regardless of
the flag.

This aligns the pipeline with the /v1/tokenize endpoint, which already
uses Tokenizer.AddBOS().
2026-03-31 16:46:30 -07:00
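
The change described above can be sketched as follows. This is a minimal illustration, not the actual mlxrunner code: the `Vocab` and `Tokenizer` types, their fields, and the toy byte-level encoding are all hypothetical stand-ins. Only the names `Encode`, `AddBOS()`, and the `vocab.BOS >= 0` guard come from the commit message. The third case (a model that defines a BOS token but sets add_bos_token=false) is not named in the commit message; it is assumed here to be the case where the old hardcoded `true` and the new call would differ.

```go
package main

import "fmt"

// Vocab is a hypothetical stand-in for the tokenizer vocabulary.
type Vocab struct {
	BOS int // BOS token ID, or -1 when the model defines no BOS token (bos_token=null)
}

// Tokenizer is a hypothetical, simplified tokenizer used only for illustration.
type Tokenizer struct {
	vocab  Vocab
	addBOS bool // mirrors the model's add_bos_token setting
}

// AddBOS reports whether the model's configuration asks for a BOS token.
func (t *Tokenizer) AddBOS() bool { return t.addBOS }

// Encode maps each byte of the prompt to a toy token ID (assumption: real
// tokenizers do not work this way). BOS is prepended only when requested
// AND the vocabulary actually defines one (the vocab.BOS >= 0 guard).
func (t *Tokenizer) Encode(prompt string, addBOS bool) []int {
	ids := make([]int, 0, len(prompt)+1)
	if addBOS && t.vocab.BOS >= 0 {
		ids = append(ids, t.vocab.BOS)
	}
	for _, b := range []byte(prompt) {
		ids = append(ids, 1000+int(b))
	}
	return ids
}

func main() {
	// add_bos_token=true: BOS is prepended before and after the change.
	withBOS := &Tokenizer{vocab: Vocab{BOS: 2}, addBOS: true}
	fmt.Println(withBOS.Encode("hi", withBOS.AddBOS()))

	// bos_token=null: the guard blocks BOS even with the old hardcoded true.
	noBOS := &Tokenizer{vocab: Vocab{BOS: -1}, addBOS: false}
	fmt.Println(noBOS.Encode("hi", true))
	fmt.Println(noBOS.Encode("hi", noBOS.AddBOS()))

	// Assumed case: BOS defined but add_bos_token=false. Here the old call
	// (true) and the new call (AddBOS()) produce different results.
	optOut := &Tokenizer{vocab: Vocab{BOS: 1}, addBOS: false}
	fmt.Println(optOut.Encode("hi", true))
	fmt.Println(optOut.Encode("hi", optOut.AddBOS()))
}
```

In this sketch, passing `AddBOS()` instead of a literal `true` leaves the first two cases unchanged, matching the commit message, and only alters output for a tokenizer that defines BOS but opts out of prepending it.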
agent x/cmd: enable web search and web fetch with flag (#13690) 2026-01-12 13:59:40 -08:00
cmd Reapply "don't require pulling stubs for cloud models" again (#14608) 2026-03-06 14:27:47 -08:00
create mlx: fix vision capability + min version (#15106) 2026-03-27 17:09:28 -07:00
imagegen ci: fix windows cgo compiler error (#15046) 2026-03-24 16:45:36 -07:00
mlxrunner mlx: respect tokenizer add_bos_token setting in pipeline (#15185) 2026-03-31 16:46:30 -07:00
models mlx: add mxfp4/mxfp8/nvfp4 importing (#15015) 2026-03-24 13:45:44 -07:00
server mlx: fix vision capability + min version (#15106) 2026-03-27 17:09:28 -07:00
tokenizer mlx: quantized embeddings, fast SwiGLU, and runtime fixes (#14884) 2026-03-17 11:21:38 -07:00
tools add ability to disable cloud (#14221) 2026-02-12 15:47:00 -08:00