Mirror of https://github.com/ollama/ollama, synced 2026-04-23 08:45:14 +00:00
GGML picks the wrong kernel and these systems fail with:

    Sep 28 22:25:39 xavier ollama[48999]: //ml/backend/ggml/ggml/src/ggml-cuda/fattn-wmma-f16.cu:437: ERROR: CUDA kernel flash_attn_ext_f16 has no device code compatible with CUDA arch 720. ggml-cuda.cu was compiled for: __CUDA_ARCH_LIST__

Fixes #12442
| File |
|---|
| cpu_linux.go |
| cpu_linux_test.go |
| cpu_windows.go |
| cpu_windows_test.go |
| gpu.go |
| gpu_darwin.go |
| gpu_info_darwin.h |
| gpu_info_darwin.m |
| path.go |
| runner.go |
| runner_test.go |
| types.go |