An agent is currently the name given to a system prompt paired with a
set of tools the agent has access to.
This resolves the previous issue of the set of configured tools being
available in *all* contexts, which wasn't always desired. Tools are now
only available when an agent is explicitly requested using the
`-a/--agent` flag.
Agents are expected to be expanded on: the concept of task-specialized
agents (e.g. coding), the ability to define a set of files an agent
should always have access to for RAG purposes, etc.
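As a rough sketch, an agent now boils down to something like the
following (field names and the exact shape are assumptions, not lmcli's
actual definitions; `ToolSpec` corresponds to the `api.ToolSpec`
mentioned below):

```go
// Illustrative only: an agent pairs a name with a system prompt and
// the set of tools that become available when the agent is requested
// with -a/--agent.
type Agent struct {
	Name         string     // matched against the -a/--agent flag
	SystemPrompt string     // no default system prompt anymore (see below)
	Toolbox      []ToolSpec // api.ToolSpec in the actual codebase
}
```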
Other changes:
- Removes the "tools" top-level config structure (though this is expected
to come back along with the ability to define custom tools).
- Renamed `pkg/agent` to `pkg/agents`
- Stop using pointers where unnecessary
- Removed default system prompt
- Set indent level to 2 when writing config
- Update ordering of config struct, which affects marshalling
- Make provider `name` optional, defaulting to the provider's `kind`
- More emphasis on the `api` package. It now holds database model structs
from `lmcli/models` (which is now gone) as well as the tool spec,
call, and result types. `tools.Tool` is now `api.ToolSpec`.
`api.ChatCompletionClient` was renamed to
`api.ChatCompletionProvider`.
- Change ChatCompletion interface and implementations to no longer do
automatic tool call recursion: they simply return a ToolCall message,
and the caller decides what to do with it (e.g. prompt for user
confirmation before executing); see the sketch after this list.
- `api.ChatCompletionProvider` functions have had their ReplyCallback
parameter removed, as they now only return a single reply.
- Added a top-level `agent` package, moved the current built-in tools
implementations under `agent/toolbox`. `tools.ExecuteToolCalls` is now
`agent.ExecuteToolCalls`.
- Fixed request context handling in openai, google, ollama (use
`NewRequestWithContext`), cleaned up request cancellation in TUI
- Fixed a tool call TUI persistence bug (we were skipping messages with
empty content)
- Now handle tool calling from the TUI layer
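To illustrate the non-recursive completion flow from the bullets above
(signatures and field names here are assumptions, not the actual lmcli
API):

```go
// The provider returns a single reply; if it carries tool calls,
// nothing is executed automatically.
type ChatCompletionProvider interface {
	CreateChatCompletion(ctx context.Context, messages []Message) (Message, error)
}

// Caller side (e.g. the TUI layer) can now gate execution on input:
//
//	reply, err := provider.CreateChatCompletion(ctx, messages)
//	if err == nil && len(reply.ToolCalls) > 0 {
//		// prompt the user for confirmation, then
//		// agent.ExecuteToolCalls(...) and send results back if desired
//	}
```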
TODO:
- Prompt users before executing tool calls
- Automatically send tool results to the model (or make this toggleable)
Also put the provider after the model (e.g. `gpt-4o@openai`,
`openai/gpt-4o@openrouter`). The aim here was to reduce clashing and
confusion with existing model naming conventions.
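Because the model part can itself contain a `/` (as in
`openai/gpt-4o@openrouter`), such an identifier presumably splits on the
last `@`; a hypothetical helper, not the actual implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// splitModelProvider splits "gpt-4o@openai" or "openai/gpt-4o@openrouter"
// into (model, provider). With no '@', the provider is left empty.
func splitModelProvider(s string) (model, provider string) {
	if i := strings.LastIndex(s, "@"); i >= 0 {
		return s[:i], s[i+1:]
	}
	return s, ""
}

func main() {
	fmt.Println(splitModelProvider("openai/gpt-4o@openrouter")) // openai/gpt-4o openrouter
}
```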
Conversations are now passed by reference instead of by value, which had
led to some odd handling of conversation references.
Also fixed some formatting and removed an unnecessary (and probably
broken) setting of ConversationID in a call to
`cmdutil.HandleConversationReply`.
We were sending an empty string to the output channel when `ping`
messages were received from Anthropic's API. This was causing the TUI to
break since we started doing an empty chunk check (and mistakenly not
waiting for future chunks if one was received).
This commit makes it so we no longer send an empty string on the ping
message from Anthropic, and updates the handling of `msgAssistantChunk`
and `msgAssistantReply` to make it less likely that we forget to wait
for the next chunk/reply.
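In stream-handling terms the fix looks something like this (event and
field names are assumptions for illustration):

```go
type streamEvent struct {
	Type string // e.g. "ping", "content_block_delta"
	Text string
}

// Forward only real content deltas; swallow keep-alive pings instead
// of emitting an empty chunk, which the TUI's empty-chunk check
// would misread as end-of-stream.
func forwardChunks(events <-chan streamEvent, output chan<- string) {
	for ev := range events {
		switch ev.Type {
		case "ping":
			continue // previously: output <- ""
		case "content_block_delta":
			output <- ev.Text
		}
	}
}
```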
Updated the behaviour of commands:
- `lmcli edit`
  - by default, create a new message branch with the edited contents
  - add `--in-place` to avoid creating a branch
  - no longer delete messages after the edited message
  - only do the edit, don't fetch a new response
- `lmcli retry`
  - create a new branch rather than replacing old messages
  - add `--offset` to change where to retry from
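For example, `lmcli edit --in-place` would modify the message without
branching, while plain `lmcli edit` preserves the original in a new
branch; `lmcli retry --offset 1` would presumably shift the retry point
one message from the default (exact offset semantics assumed here).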
When the last message in the passed messages slice is an assistant
message, treat it as a partial message that is being continued, and
include its content in the newly created reply.
Updated the TUI code to handle the new behaviour.
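A sketch of that continuation rule (assuming a `Message` type with
`Role` and `Content` fields, and an assistant role constant; names are
illustrative):

```go
// If the slice ends with an assistant message, seed the new reply
// with its content so the model continues it rather than starting fresh.
func continuationPrefix(messages []Message) string {
	if n := len(messages); n > 0 && messages[n-1].Role == MessageRoleAssistant {
		return messages[n-1].Content
	}
	return ""
}
```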
Strip the function call XML from the returned/saved content, which
should allow for model switching between openai/anthropic (and
others?) within the same conversation involving tool calls.
This involves reconstructing the function call XML when sending requests
to anthropic.
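Roughly, the strip step could look like this (the `<function_calls>`
tag name is an assumption about the XML Anthropic's tool calling used):

```go
package main

import (
	"fmt"
	"regexp"
)

// Matches an entire function-call XML block in message content.
var functionCallsRE = regexp.MustCompile(`(?s)<function_calls>.*?</function_calls>`)

// stripFunctionCallXML removes the block so saved content stays
// provider-neutral; it gets reconstructed from the stored tool call
// data when a request goes back to anthropic.
func stripFunctionCallXML(content string) string {
	return functionCallsRE.ReplaceAllString(content, "")
}

func main() {
	fmt.Println(stripFunctionCallXML("Let me check.<function_calls>...</function_calls>"))
}
```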
Instead of CreateChatCompletion* accepting a pointer to a slice of reply
messages, it now accepts a callback which is called with each successive
reply in the conversation.
This gives the caller more flexibility in how it handles replies (e.g.
it can react to each reply immediately, instead of waiting for the
entire call to finish).
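In signature terms the change is roughly the following (parameter names
assumed; note this callback was later removed again, per the
ReplyCallback bullet above):

```go
// Before: replies were accumulated into a caller-provided slice,
// available only after the whole call finished.
//	CreateChatCompletion(ctx context.Context, messages []Message, replies *[]Message) error

// After: each reply is handed to the caller as it completes, so it
// can be persisted or rendered immediately.
//	CreateChatCompletion(ctx context.Context, messages []Message, callback func(Message)) error
```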