mirror of https://github.com/ollama/ollama synced 2026-04-23 08:45:14 +00:00

Daniel Hiltgen 10e51c5177 MLX: add header vendoring and remove go build tag (#14642)		2026-03-09 17:24:45 -07:00

* prefer rocm v6 on windows
  Avoid building with v7 - more changes are needed
* MLX: add header vendoring and remove go build tag
  This switches to using a vendoring approach for the mlx-c headers so that Go can build without requiring a cmake first. This enables building the new MLX based code by default. Every time cmake runs, the headers are refreshed, so we can easily keep them in sync when we bump mlx versions. Basic Windows and Linux support are verified.
* ci: harden for flaky choco repo servers
  CI sometimes fails due to choco not actually installing cache. Since it just speeds up the build, we can proceed without.
* review comments
CMakeLists.txt	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
compile.go	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
doc.go	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
generate_wrappers.go	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
mlx.c	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
mlx.go	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
mlx.h	update mlx-c bindings to 0.5.0 (#14380)	2026-02-23 16:44:29 -08:00
mlx_dynamic.c	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
mlx_dynamic.h	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
mlx_test.go	MLX: add header vendoring and remove go build tag (#14642)	2026-03-09 17:24:45 -07:00
README.md	Add experimental MLX backend and engine with imagegen support (#13648)	2026-01-08 16:18:59 -08:00

README.md

MLX Memory Management

Note: This package will be consolidated with x/ml/backend/mlx in the future.

Automatic Tracking

All arrays are automatically tracked when created. On Eval(), non-kept arrays are freed.

API

result := mlx.Matmul(x, w) // arrays automatically tracked
mlx.Eval(result)           // free non-kept, eval result (auto-kept)

Key Functions

mlx.Eval(outputs...) - free non-kept arrays, then evaluate (outputs auto-kept)
mlx.AsyncEval(outputs...) - async version of Eval (outputs auto-kept)
mlx.Keep(arrays...) - mark arrays to survive cleanup (for weights, caches)
array.Free() - mark array for cleanup on next Eval
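
The rules above can be read as a small state machine over a registry of live arrays. The sketch below is a toy model in plain Go, not this package's implementation: the Array struct, the tracked slice, and New are hypothetical stand-ins, no MLX code is called, and treating Free() as clearing the kept mark is an assumption drawn from the descriptions above.

```go
package main

import "fmt"

// Array is a hypothetical stand-in for an MLX array handle.
type Array struct {
	name  string
	kept  bool // survives cleanup while true
	freed bool // set once cleanup has released the array
}

// tracked is a hypothetical registry of live arrays.
var tracked []*Array

// New creates an array and registers it, mimicking automatic tracking.
func New(name string) *Array {
	a := &Array{name: name}
	tracked = append(tracked, a)
	return a
}

// Keep marks arrays so that they survive cleanup.
func Keep(arrays ...*Array) {
	for _, a := range arrays {
		a.kept = true
	}
}

// Free clears the kept mark (an assumption); release happens at the next Eval.
func (a *Array) Free() { a.kept = false }

// Eval auto-keeps its outputs and releases every other non-kept array.
// Real evaluation of the outputs is out of scope for this toy model.
func Eval(outputs ...*Array) {
	for _, o := range outputs {
		o.kept = true
	}
	live := tracked[:0]
	for _, a := range tracked {
		if a.kept {
			live = append(live, a)
		} else {
			a.freed = true
		}
	}
	tracked = live
}

func main() {
	w := New("weights")
	Keep(w)
	x := New("input")
	y := New("matmul(x, w)")
	Eval(y)
	fmt.Println(w.freed, x.freed, y.freed) // false true false

	y.Free() // marked only; nothing is released yet
	z := New("next")
	Eval(z)
	fmt.Println(y.freed, z.freed, len(tracked)) // true false 2
}
```

In this model the input is released at the first Eval because nothing kept it, while the output stays alive until Free() is called and a later Eval runs.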

Loop Pattern

for step := 0; step < maxTokens; step++ {
    logits := model.Forward(token, caches)
    oldToken := token
    token = sample(logits)

    // Keep cache state across iterations
    for _, c := range caches {
        mlx.Keep(c.State()...)
    }

    oldToken.Free()       // mark for cleanup
    mlx.AsyncEval(token)  // frees old, evals new
}
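
Because AsyncEval() auto-keeps its outputs, each sampled token stays kept until it is explicitly freed, which is presumably why the loop calls oldToken.Free(). The toy simulation below illustrates that effect under this reading of the rules. The names tracker, alloc, and decode are hypothetical, and no MLX code is involved.

```go
package main

import "fmt"

// array is a toy stand-in for an MLX array handle.
type array struct{ kept bool }

// tracker is a hypothetical registry that models the cleanup rules.
type tracker struct{ live []*array }

// alloc creates an array and tracks it.
func (t *tracker) alloc() *array {
	a := &array{}
	t.live = append(t.live, a)
	return a
}

// eval auto-keeps the outputs and drops every other non-kept array.
func (t *tracker) eval(outputs ...*array) {
	for _, o := range outputs {
		o.kept = true
	}
	kept := t.live[:0]
	for _, a := range t.live {
		if a.kept {
			kept = append(kept, a)
		}
	}
	t.live = kept
}

// decode runs a toy generation loop and returns the number of live
// arrays after each step.
func decode(steps int, freeOld bool) []int {
	t := &tracker{}
	weights := t.alloc()
	weights.kept = true // stands in for Keep(weights)
	token := t.alloc()
	t.eval(token)

	counts := make([]int, 0, steps)
	for i := 0; i < steps; i++ {
		t.alloc() // logits: an intermediate that is never kept
		old := token
		token = t.alloc() // newly sampled token
		if freeOld {
			old.kept = false // stands in for oldToken.Free()
		}
		t.eval(token)
		counts = append(counts, len(t.live))
	}
	return counts
}

func main() {
	fmt.Println(decode(4, true))  // [2 2 2 2]
	fmt.Println(decode(4, false)) // [3 4 5 6]
}
```

With the free call, only the weights and the current token stay live after each step. Without it, every earlier token remains kept and the live count grows by one per step.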

Notes

Eval() and AsyncEval() auto-keep their outputs.
Free() only marks an array for cleanup; the actual free happens during the next Eval() or AsyncEval().
Use Keep() for weights and cache state that must survive multiple Eval cycles.
Arrays created inside compiled closures are managed by MLX and are not tracked.