Mirror of https://github.com/ollama/ollama, synced 2026-04-23 08:45:14 +00:00
Converts SiLU/GELUApprox to compiled kernels and adds SwiGLU, matching upstream mlx/mlx_lm's activations pattern. Routes llama, qwen3, qwen3_5 (dense + MoE), and glm4_moe_lite MLP paths through mlx.SwiGLU so each MLP invocation runs as one fused Metal/CUDA kernel rather than a chain of per-op launches.
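For reference, the computation being fused is the standard SwiGLU gating used in these MLP blocks: the gate projection passes through SiLU and multiplies the up projection elementwise. A minimal NumPy sketch of the math (not the actual mlx compiled kernel; the names `silu`, `swiglu`, `w_gate`, and `w_up` here are illustrative):

```python
import numpy as np

def silu(x):
    # SiLU (swish): x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def swiglu(x, w_gate, w_up):
    # SwiGLU gating: silu(x @ W_gate) * (x @ W_up).
    # A compiled/fused kernel evaluates this whole expression in one
    # launch instead of separate matmul, activation, and multiply ops.
    return silu(x @ w_gate) * (x @ w_up)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8))
w_gate = rng.standard_normal((8, 16))
w_up = rng.standard_normal((8, 16))
y = swiglu(x, w_gate, w_up)
print(y.shape)
```

The down projection of the MLP then maps the gated hidden state back to the model dimension; the fusion covers the gate/up/activation portion, which is otherwise a chain of small per-op launches.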
| File |
|---|
| qwen3.go |