Commit graph

22 commits

Author SHA1 Message Date
Devon Rifkin 626af2d809
template: fix args-as-json rendering (#13636)
In #13525, I accidentally broke templates' ability to automatically
render tool call function arguments as JSON.

We do need these to be proper maps because we need templates to be able
to call range, which can't be done on custom types.
2026-01-06 18:33:57 -08:00
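
The constraint described in the commit message above (arguments must stay rangeable as maps while still rendering as JSON) can be illustrated with a minimal sketch. It assumes a hypothetical named map type, `Args`, with a `String` method; neither the type name nor the `render` helper comes from the commit itself, and the actual fix may be structured differently. The sketch relies only on standard `text/template` behavior: `range` works on any value whose underlying kind is a map, and printing a value that implements `fmt.Stringer` calls its `String` method.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"text/template"
)

// Args is a hypothetical named map type (an assumption for this sketch).
// Its underlying kind is still a map, so templates can range over it.
type Args map[string]any

// String makes {{ .Args }} print JSON instead of Go's default map[...] form.
func (a Args) String() string {
	b, err := json.Marshal(map[string]any(a))
	if err != nil {
		return ""
	}
	return string(b)
}

// render is a hypothetical helper that executes a template with Args
// exposed under the key "Args".
func render(text string, args Args) (string, error) {
	tmpl, err := template.New("sketch").Parse(text)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, map[string]any{"Args": args}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	args := Args{"city": "Paris", "days": 3}

	// Printing the whole value goes through String and yields JSON.
	whole, err := render(`{{ .Args }}`, args)
	if err != nil {
		panic(err)
	}
	fmt.Println(whole) // {"city":"Paris","days":3}

	// Ranging still works because Args is a map underneath.
	each, err := render(`{{ range $k, $v := .Args }}{{ $k }}={{ $v }};{{ end }}`, args)
	if err != nil {
		panic(err)
	}
	fmt.Println(each) // city=Paris;days=3;
}
```

A struct wrapper around the map would also print as JSON through `String`, but `range` would then fail at execution time, which is the trade-off the commit message points at.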
Jeffrey Morgan 48e78e9be1
template: add yesterdayDate helper function (#13431) 2025-12-11 14:47:55 -08:00
Michael Yang 718961de68
migrate to golangci-lint v2 (#13109)
* migrate to golangci-lint v2
* copyloopvar
2025-11-18 11:00:26 -08:00
Patrick Devine 2fa1e92a99
test: add template error test (#12489) 2025-10-03 12:05:34 -07:00
Patrick Devine 1ed2881ef0
templates: fix crash in improperly defined templates (#12483) 2025-10-02 17:25:55 -07:00
Parth Sareen 1f91cb0c8c
template: add tool result compatibility (#11294) 2025-07-07 15:53:42 -07:00
Michael Yang 58245413f4
next ollama runner (#7913)
feat: add new Ollama engine using ggml through cgo

This change introduces a new way to run pretrained models. It introduces three high-level interfaces and several smaller helper interfaces to support this.

- `model.Model` defines the interface for a model architecture. Models such as `llama` and `mllama`, which are provided as examples, can implement the model's forward propagation in the `Forward` method. This method will be called to generate completions. This interface can be found in `model/model.go`.
- `ml.Backend` defines the interface for a backend tensor library, in this case `ggml`. Among other things, a Backend is responsible for loading a pretrained model into hardware (GPU, CPU, etc.) and providing an interface for Models to access loaded tensors. This interface can be found in `ml/backend.go`.
- `ml.Tensor` defines the interface for a tensor and tensor operations.

This is the first implementation of the new engine. Follow-up PRs will implement more features:

- non-greedy sampling (#8410)
- integration with Ollama and KV caching (#8301)
- more model support (#9080) with more coming soon

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2025-02-13 16:31:21 -08:00
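
The division of responsibilities among the three interfaces named in the commit message above can be sketched roughly as follows. This is a toy illustration only: every method signature and every concrete type here (`sliceTensor`, `mapBackend`, `biasModel`) is an assumption made for the sketch, not the contents of `model/model.go` or `ml/backend.go`. The only detail taken from the message is that a model implements a `Forward` method and reaches loaded tensors through the backend.

```go
package main

import "fmt"

// Tensor is a hypothetical, simplified stand-in for ml.Tensor:
// a value plus the operations defined on it.
type Tensor interface {
	Add(other Tensor) Tensor
	Floats() []float32
}

// Backend is a hypothetical stand-in for ml.Backend: it owns the loaded
// tensors and lets models look them up by name.
type Backend interface {
	Get(name string) Tensor
}

// Model is a hypothetical stand-in for model.Model: an architecture
// that implements forward propagation.
type Model interface {
	Forward(b Backend, input Tensor) (Tensor, error)
}

// sliceTensor is a toy CPU-only tensor backed by a Go slice.
type sliceTensor []float32

func (t sliceTensor) Floats() []float32 { return t }

// Add returns the element-wise sum; callers must pass equal lengths.
func (t sliceTensor) Add(other Tensor) Tensor {
	o := other.Floats()
	out := make(sliceTensor, len(t))
	for i := range t {
		out[i] = t[i] + o[i]
	}
	return out
}

// mapBackend is a toy backend that keeps "loaded" tensors in a map.
type mapBackend map[string]Tensor

func (b mapBackend) Get(name string) Tensor { return b[name] }

// biasModel is a toy architecture whose forward pass adds a bias tensor.
type biasModel struct{}

func (biasModel) Forward(b Backend, input Tensor) (Tensor, error) {
	bias := b.Get("bias")
	if bias == nil {
		return nil, fmt.Errorf("tensor %q not loaded", "bias")
	}
	if len(bias.Floats()) != len(input.Floats()) {
		return nil, fmt.Errorf("shape mismatch: %d vs %d",
			len(bias.Floats()), len(input.Floats()))
	}
	return input.Add(bias), nil
}

func main() {
	var m Model = biasModel{}
	backend := mapBackend{"bias": sliceTensor{1, 2}}

	out, err := m.Forward(backend, sliceTensor{10, 20})
	if err != nil {
		panic(err)
	}
	fmt.Println(out.Floats()) // [11 22]
}
```

Because the model depends only on the `Backend` and `Tensor` interfaces, a different tensor library could be swapped in without touching the model code, which appears to be the motivation for the split.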
Patrick Devine c7cb0f0602
image processing for llama3.2 (#6963)
Co-authored-by: jmorganca <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Jesse Gross <jesse@ollama.com>
2024-10-18 16:12:35 -07:00
Michael Yang b732beba6a lint 2024-08-01 17:06:06 -07:00
Jeffrey Morgan 20090f3172
preserve last assistant message (#5802) 2024-07-19 20:19:26 -07:00
Michael Yang d290e87513 add suffix support to generate endpoint
this change is triggered by the presence of "suffix", particularly
useful for code completion tasks
2024-07-16 14:31:35 -07:00
Michael Yang 36c87c433b template: preprocess message and collect system 2024-07-12 12:26:43 -07:00
Michael Yang 5056bb9c01 rename aggregate to contents 2024-07-11 17:00:26 -07:00
Michael Yang 57ec6901eb revert embedded templates to use prompt/response
This reverts commit 19753c18c0.

for compat. messages will be added at a later date
2024-07-11 14:49:35 -07:00
Michael Yang e64f9ebb44 do not automatically aggregate system messages 2024-07-11 14:49:35 -07:00
Michael Yang 41be28096a add system prompt to first legacy template 2024-07-10 17:03:08 -07:00
Michael Yang fb6cbc02fb update named templates 2024-07-05 16:29:32 -07:00
Michael Yang 326363b3a7 no funcs 2024-07-05 13:17:25 -07:00
Michael Yang 2c3fe1fd97 comments 2024-07-05 13:17:24 -07:00
Michael Yang 269ed6e6a2 update message processing 2024-07-05 13:16:58 -07:00
Michael Yang a30915bde1 add capabilities 2024-07-01 10:47:43 -07:00
Michael Yang 58e3fff311 rename templates to template 2024-07-01 10:40:54 -07:00