Mirror of https://github.com/ollama/ollama, synced 2026-04-23 08:45:14 +00:00
* tokenizer: add byte fallback for SentencePiece BPE encoding

  When BPE merging produces tokens not in the vocabulary, fall back to encoding each UTF-8 byte as an `<0xHH>` byte token instead of silently dropping the character. Also teach Decode to convert `<0xHH>` tokens back to raw bytes.

  Fixes #15229, fixes #15231

* tokenizer fixes
| Name |
|---|
| bert |
| deepseek2 |
| deepseekocr |
| gemma2 |
| gemma3 |
| gemma3n |
| gemma4 |
| glm4moelite |
| glmocr |
| gptoss |
| lfm2 |
| llama |
| llama4 |
| mistral3 |
| mllama |
| nemotronh |
| nomicbert |
| olmo3 |
| qwen2 |
| qwen3 |
| qwen3next |
| qwen3vl |
| qwen25vl |
| models.go |
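The byte-fallback behavior described in the commit message above can be sketched in Go. This is a minimal illustration, not the actual ollama tokenizer code: the function names `byteFallback` and `decodeByteToken`, the vocabulary map, and the token IDs are all hypothetical. The idea is that when a piece is absent from the vocabulary, each UTF-8 byte is emitted as a synthetic `<0xHH>` token, and decoding reverses that mapping back to raw bytes.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// byteFallback looks up a piece in the vocabulary; if it is missing, it
// falls back to emitting one <0xHH> token per UTF-8 byte instead of
// dropping the character. Vocab layout and IDs are hypothetical.
func byteFallback(piece string, vocab map[string]int32) []int32 {
	if id, ok := vocab[piece]; ok {
		return []int32{id}
	}
	ids := make([]int32, 0, len(piece))
	for i := 0; i < len(piece); i++ {
		// Indexing a Go string yields raw bytes, which is what we want here.
		if id, ok := vocab[fmt.Sprintf("<0x%02X>", piece[i])]; ok {
			ids = append(ids, id)
		}
	}
	return ids
}

// decodeByteToken reverses the fallback: a token of the exact form
// <0xHH> is converted back to its raw byte value.
func decodeByteToken(tok string) (byte, bool) {
	if len(tok) == 6 && strings.HasPrefix(tok, "<0x") && tok[5] == '>' {
		if n, err := strconv.ParseUint(tok[3:5], 16, 8); err == nil {
			return byte(n), true
		}
	}
	return 0, false
}

func main() {
	vocab := map[string]int32{"hello": 1, "<0xE2>": 2, "<0x9C>": 3, "<0x93>": 4}
	fmt.Println(byteFallback("hello", vocab)) // in-vocab piece: [1]
	fmt.Println(byteFallback("✓", vocab))     // U+2713 is E2 9C 93 in UTF-8: [2 3 4]

	b, ok := decodeByteToken("<0x41>")
	fmt.Println(b, ok) // 65 true
}
```

Note that encoding and decoding must agree on the `<0xHH>` spelling (two uppercase hex digits), which is why both sides format and parse exactly that shape.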