Mirror of https://github.com/ollama/ollama, synced 2026-04-23 08:45:14 +00:00
Converts SiLU/GELUApprox to compiled kernels and adds SwiGLU, matching upstream mlx/mlx_lm's activations pattern. Routes llama, qwen3, qwen3_5 (dense + MoE), and glm4_moe_lite MLP paths through mlx.SwiGLU so each MLP invocation runs as one fused Metal/CUDA kernel rather than a chain of per-op launches.
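For reference, the computation being fused is the standard SwiGLU gating used in these MLP blocks: the gate projection passes through SiLU and multiplies the up projection elementwise. A minimal NumPy sketch of the math (not the actual mlx compiled kernel; the names `silu`, `swiglu`, `w_gate`, and `w_up` here are illustrative):

```python
import numpy as np

def silu(x):
    # SiLU (swish): x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def swiglu(x, w_gate, w_up):
    # SwiGLU gating: silu(x @ W_gate) * (x @ W_up).
    # A compiled/fused kernel evaluates this whole expression in one
    # launch instead of separate matmul, activation, and multiply ops.
    return silu(x @ w_gate) * (x @ w_up)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8))
w_gate = rng.standard_normal((8, 16))
w_up = rng.standard_normal((8, 16))
y = swiglu(x, w_gate, w_up)
print(y.shape)
```

The down projection of the MLP then maps the gated hidden state back to the model dimension; the fusion covers the gate/up/activation portion, which is otherwise a chain of small per-op launches.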
| File |
|---|
| qwen3.go |