Mirror of https://github.com/ollama/ollama, synced 2026-04-23 08:45:14 +00:00
GGML picks the wrong kernel and these systems fail with:

    Sep 28 22:25:39 xavier ollama[48999]: //ml/backend/ggml/ggml/src/ggml-cuda/fattn-wmma-f16.cu:437: ERROR: CUDA kernel flash_attn_ext_f16 has no device code compatible with CUDA arch 720. ggml-cuda.cu was compiled for: __CUDA_ARCH_LIST__

Fixes #12442
| File |
|---|
| cpu_linux.go |
| cpu_linux_test.go |
| cpu_windows.go |
| cpu_windows_test.go |
| gpu.go |
| gpu_darwin.go |
| gpu_info_darwin.h |
| gpu_info_darwin.m |
| path.go |
| runner.go |
| runner_test.go |
| types.go |