ollama/x/imagegen/mlx

MLX Memory Management

Note: This package will be consolidated with x/ml/backend/mlx in the future.

Automatic Tracking

Every array is tracked automatically when it is created. Each call to Eval() frees the tracked arrays that have not been kept.

API

result := mlx.Matmul(x, w) // arrays automatically tracked
mlx.Eval(result)           // free non-kept, eval result (auto-kept)

Key Functions

  • mlx.Eval(outputs...) - free non-kept arrays, then evaluate (outputs auto-kept)
  • mlx.AsyncEval(outputs...) - async version of Eval (outputs auto-kept)
  • mlx.Keep(arrays...) - mark arrays to survive cleanup (for weights, caches)
  • array.Free() - mark array for cleanup on next Eval
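
The lifecycle behind these four functions can be sketched with a small pure-Go model. This is only an illustration of the bookkeeping described above, not the package's implementation: the `tracker` and `array` types and the lowercase `keep`, `free`, and `eval` helpers are hypothetical stand-ins, and the model does no actual computation or memory management.

```go
package main

import "fmt"

// array is a hypothetical stand-in for an MLX array handle.
type array struct {
	name  string
	kept  bool // survives cleanup while true
	freed bool // set once a cleanup pass has released the handle
}

// tracker is a hypothetical model of the package's tracking list.
type tracker struct {
	tracked []*array
}

// newArray registers every array at creation time (automatic tracking).
func (t *tracker) newArray(name string) *array {
	a := &array{name: name}
	t.tracked = append(t.tracked, a)
	return a
}

// keep marks arrays to survive cleanup, in the spirit of mlx.Keep.
func keep(arrays ...*array) {
	for _, a := range arrays {
		a.kept = true
	}
}

// free only marks the array; nothing is released until the next eval.
func (a *array) free() { a.kept = false }

// eval auto-keeps its outputs, then releases every tracked array that is
// not kept. Real evaluation of the outputs is not modeled here.
func (t *tracker) eval(outputs ...*array) {
	keep(outputs...)
	live := t.tracked[:0]
	for _, a := range t.tracked {
		if a.kept {
			live = append(live, a)
		} else {
			a.freed = true
		}
	}
	t.tracked = live
}

// liveNames lists the arrays that are still tracked.
func (t *tracker) liveNames() []string {
	names := make([]string, 0, len(t.tracked))
	for _, a := range t.tracked {
		names = append(names, a.name)
	}
	return names
}

func main() {
	t := &tracker{}
	w := t.newArray("w")
	keep(w) // weights must survive many eval cycles
	x := t.newArray("x")
	result := t.newArray("x@w")

	t.eval(result)
	fmt.Println(t.liveNames(), x.freed) // [w x@w] true

	result.free() // marked only; still tracked until the next eval
	t.eval()
	fmt.Println(t.liveNames()) // [w]
}
```

In this model the unkept input `x` disappears at the first `eval`, the output survives because it is auto-kept, and the kept weight `w` outlives both cycles.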

Loop Pattern

for step := 0; step < maxTokens; step++ {
    logits := model.Forward(token, caches)
    oldToken := token
    token = sample(logits)

    // Keep cache state across iterations
    for _, c := range caches {
        mlx.Keep(c.State()...)
    }

    oldToken.Free()       // mark for cleanup
    mlx.AsyncEval(token)  // frees old, evals new
}
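
Why the loop calls oldToken.Free() can be seen in a toy simulation. Under the semantics described in this README, each token passed to AsyncEval is auto-kept, so it would presumably stay alive until it is explicitly marked for cleanup. The sketch below is an assumption-laden model, not the package's code: `tracker`, `array`, and `runLoop` are hypothetical names, cache state is left out, and Eval and AsyncEval are treated identically for bookkeeping purposes.

```go
package main

import "fmt"

// array is a hypothetical stand-in for an MLX array handle.
type array struct{ kept bool }

// tracker is a hypothetical model of the package's tracking list.
type tracker struct{ tracked []*array }

// newArray registers a new array with the tracker.
func (t *tracker) newArray() *array {
	a := &array{}
	t.tracked = append(t.tracked, a)
	return a
}

// eval auto-keeps the outputs and drops every tracked array that is not kept.
func (t *tracker) eval(outputs ...*array) {
	for _, o := range outputs {
		o.kept = true
	}
	live := t.tracked[:0]
	for _, a := range t.tracked {
		if a.kept {
			live = append(live, a)
		}
	}
	t.tracked = live
}

// runLoop mimics the decode loop and returns how many arrays are still
// tracked at the end. freeOld controls whether the previous token is
// marked for cleanup before each eval.
func runLoop(steps int, freeOld bool) int {
	t := &tracker{}
	token := t.newArray() // initial token
	t.eval(token)
	for step := 0; step < steps; step++ {
		t.newArray() // logits: an intermediate that is never kept
		oldToken := token
		token = t.newArray() // sampled token
		if freeOld {
			oldToken.kept = false // models oldToken.Free()
		}
		t.eval(token)
	}
	return len(t.tracked)
}

func main() {
	fmt.Println(runLoop(100, true), runLoop(100, false)) // 1 101
}
```

With the Free() call the tracked set stays at a single token per step; without it, one auto-kept token accumulates per iteration in this model.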

Notes

  • Eval() and AsyncEval() auto-keep their outputs
  • Free() only marks an array for cleanup; the actual free happens during the next Eval() or AsyncEval()
  • Use Keep() for weights and cache state that must survive multiple Eval cycles
  • Arrays created inside compiled closures are managed by MLX, not tracked