- More emphasis on `api` package. It now holds database model structs
from `lmcli/models` (which is now gone) as well as the tool spec,
call, and result types. `tools.Tool` is now `api.ToolSpec`.
`api.ChatCompletionClient` was renamed to
`api.ChatCompletionProvider`.
- Changed the ChatCompletion interface and implementations to no longer do
  automatic tool call recursion - they now simply return a ToolCall message,
  which the caller can decide what to do with (e.g. prompt for user
  confirmation before executing)
- `api.ChatCompletionProvider` functions have had their ReplyCallback
parameter removed, as now they only return a single reply.
- Added a top-level `agent` package and moved the current built-in tool
  implementations under `agent/toolbox`. `tools.ExecuteToolCalls` is now
  `agent.ExecuteToolCalls`.
- Fixed request context handling in openai, google, ollama (use
`NewRequestWithContext`), cleaned up request cancellation in TUI
- Fixed a tool call persistence bug in the TUI (we were skipping messages
  with empty content)
- Now handle tool calling from the TUI layer
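The new no-recursion flow can be sketched roughly as follows. The stand-in types and the `confirm` hook are assumptions for illustration; the real shapes live in lmcli's `api` and `agent` packages (`api.ToolCall`, `agent.ExecuteToolCalls`):

```go
package main

import "fmt"

// Stand-in shapes mirroring the described api package types (assumed for
// illustration; the real definitions live in lmcli's api package).
type ToolCall struct {
	Name string
	Args map[string]string
}

type ToolResult struct {
	Name   string
	Output string
}

// executeToolCalls stands in for agent.ExecuteToolCalls: the provider no
// longer recurses automatically, so the caller decides when to run tools.
func executeToolCalls(calls []ToolCall) []ToolResult {
	results := make([]ToolResult, 0, len(calls))
	for _, c := range calls {
		results = append(results, ToolResult{Name: c.Name, Output: "ran " + c.Name})
	}
	return results
}

// confirm is a hypothetical user-confirmation hook (e.g. a TUI prompt).
func confirm(call ToolCall) bool { return true }

// handleReply shows the caller-side decision point: filter tool calls
// through confirmation, then execute only the approved ones.
func handleReply(calls []ToolCall) []ToolResult {
	var approved []ToolCall
	for _, c := range calls {
		if confirm(c) {
			approved = append(approved, c)
		}
	}
	return executeToolCalls(approved)
}

func main() {
	results := handleReply([]ToolCall{{Name: "read_file"}})
	fmt.Println(results[0].Output) // ran read_file
}
```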
TODO:
- Prompt users before executing tool calls
- Automatically send tool results to the model (or make this toggleable)
Also put the provider after the model (e.g. `gpt-4o@openai`,
`openai/gpt-4o@openrouter`). The aim here was to reduce clashing and
confusion with existing model naming conventions.
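A minimal sketch of parsing the new convention (the helper name is a hypothetical one, not lmcli's actual parser). Splitting on the last `@` matters because the model part may itself contain slashes and other separators:

```go
package main

import (
	"fmt"
	"strings"
)

// splitModelProvider illustrates the `model@provider` convention, where the
// model itself may contain slashes (e.g. `openai/gpt-4o@openrouter`).
// Splitting on the last '@' keeps the full model name intact.
func splitModelProvider(s string) (model, provider string) {
	if i := strings.LastIndex(s, "@"); i >= 0 {
		return s[:i], s[i+1:]
	}
	return s, "" // no explicit provider; a default would apply
}

func main() {
	m, p := splitModelProvider("openai/gpt-4o@openrouter")
	fmt.Println(m, p) // openai/gpt-4o openrouter
}
```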
Also calculate the tokens/chunk for gemini responses, fixing the tok/s
meter for gemini models. Further, only the first candidate of streamed
gemini responses is now considered.
Instead of a value, which led to some odd handling of conversation
references.
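The value-vs-pointer distinction is the crux here: passing a struct by value means mutations land on a copy. A minimal illustration (the `Conversation` shape and function names are assumptions, not lmcli's actual code):

```go
package main

import "fmt"

// Conversation stands in for the model struct in question; when it is passed
// by value, mutations apply to a copy - the kind of odd reference handling
// that passing a pointer avoids.
type Conversation struct{ Title string }

func renameByValue(c Conversation, t string)    { c.Title = t } // mutates a copy
func renameByPointer(c *Conversation, t string) { c.Title = t } // mutates the original

func main() {
	c := Conversation{Title: "old"}
	renameByValue(c, "new")
	fmt.Println(c.Title) // old
	renameByPointer(&c, "new")
	fmt.Println(c.Title) // new
}
```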
Also fixed some formatting and removed an unnecessary (and probably
broken) setting of ConversationID in a call to
`cmdutil.HandleConversationReply`.
We were sending an empty string to the output channel when `ping`
messages were received from Anthropic's API. This was causing the TUI to
break since we started doing an empty chunk check (and mistakenly not
waiting for future chunks if one was received).
This commit makes it so we no longer send an empty string when a ping
message is received from Anthropic, and updates the handling of
msgAssistantChunk and msgAssistantReply to make it less likely that we
forget to wait for the next chunk/reply.
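The fix amounts to skipping ping events entirely rather than forwarding `""`. A sketch of the stream-handling shape (event type names and the channel wiring are assumptions for illustration):

```go
package main

import "fmt"

// streamEvent loosely models an Anthropic SSE event; field names are assumed.
type streamEvent struct {
	Type string // e.g. "ping" or "content_block_delta"
	Text string
}

// collectChunks forwards content chunks to out, skipping ping events entirely
// instead of sending an empty string (which previously tripped the TUI's
// empty-chunk check and stopped it waiting for further chunks).
func collectChunks(events []streamEvent, out chan<- string) {
	defer close(out)
	for _, ev := range events {
		if ev.Type == "ping" {
			continue // do not emit "" for pings
		}
		out <- ev.Text
	}
}

func main() {
	out := make(chan string)
	go collectChunks([]streamEvent{
		{Type: "ping"},
		{Type: "content_block_delta", Text: "Hello"},
		{Type: "content_block_delta", Text: ", world"},
	}, out)
	var full string
	for chunk := range out {
		full += chunk
	}
	fmt.Println(full) // Hello, world
}
```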
Updated the behaviour of commands:
- `lmcli edit`
  - by default, create a new message branch with the edited contents
- add --in-place to avoid creating a branch
- no longer delete messages after the edited message
- only do the edit, don't fetch a new response
- `lmcli retry`
- create a new branch rather than replacing old messages
- add --offset to change where to retry from