ollama

mirror of https://github.com/ollama/ollama synced 2026-04-23 08:45:14 +00:00


Daniel Hiltgen de9673ac3f tokenizer: add byte fallback for SentencePiece BPE encoding (#15232)	2026-04-02 13:04:45 -07:00

* tokenizer: add byte fallback for SentencePiece BPE encoding

  When BPE merging produces tokens not in the vocabulary, fall back to encoding each UTF-8 byte as <0xHH> byte tokens instead of silently dropping the character. Also teach Decode to convert <0xHH> tokens back to raw bytes.

  Fixes #15229, fixes #15231

* tokenizer fixes
model.go	Add support for gemma4 (#15214)	2026-04-02 11:33:33 -07:00
model_audio.go	Add support for gemma4 (#15214)	2026-04-02 11:33:33 -07:00
model_text.go	Add support for gemma4 (#15214)	2026-04-02 11:33:33 -07:00
model_vision.go	Add support for gemma4 (#15214)	2026-04-02 11:33:33 -07:00
process_audio.go	Add support for gemma4 (#15214)	2026-04-02 11:33:33 -07:00
process_image.go	Add support for gemma4 (#15214)	2026-04-02 11:33:33 -07:00
tokenizer_compare_test.go	Add support for gemma4 (#15214)	2026-04-02 11:33:33 -07:00
tokenizer_reference_test.go	tokenizer: add byte fallback for SentencePiece BPE encoding (#15232)	2026-04-02 13:04:45 -07:00