This refactor splits out all conversation concerns into a new
`conversation` package. There is now a split between `conversation`'s
and `api`'s representations of `Message`, the latter storing the minimum
information required for interaction with LLM providers. Conversion
between the two is necessary when making LLM calls.
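At the call boundary, that conversion looks roughly like this (a minimal
sketch; the helper name and the exact fields on either `Message` type
are assumptions):

```go
// Sketch only: assumes both Message types expose Role and Content
// fields. Real field names in lmcli may differ.
func toAPIMessages(msgs []conversation.Message) []api.Message {
	out := make([]api.Message, 0, len(msgs))
	for _, m := range msgs {
		out = append(out, api.Message{
			Role:    m.Role,
			Content: m.Content,
		})
	}
	return out
}
```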
- More emphasis on the `api` package. It now holds database model structs
from `lmcli/models` (which is now gone) as well as the tool spec,
call, and result types. `tools.Tool` is now `api.ToolSpec`.
`api.ChatCompletionClient` was renamed to
`api.ChatCompletionProvider`.
- Changed the ChatCompletion interface and implementations to no longer
perform automatic tool call recursion: they now simply return a ToolCall
message, and the caller decides what to do with it (e.g. prompt for user
confirmation before executing; see the flow sketch after this list)
- `api.ChatCompletionProvider` functions have had their ReplyCallback
parameter removed, as they now return only a single reply.
- Added a top-level `agent` package and moved the built-in tool
implementations under `agent/toolbox`. `tools.ExecuteToolCalls` is now
`agent.ExecuteToolCalls`.
- Fixed request context handling in openai, google, and ollama (use
`NewRequestWithContext`; illustrated after this list), and cleaned up
request cancellation in the TUI
- Fixed a tool call persistence bug in the TUI (we were skipping
messages with empty content)
- Now handle tool calling from the TUI layer
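Taken together, the new caller-driven flow looks something like the
following. This is a sketch only: the type names
(`api.ChatCompletionProvider`, `api.ToolSpec`, `agent.ExecuteToolCalls`)
come from this change, but the method signature, field names, and the
`promptUser`/`handleCompletion` helpers are assumptions.

```go
// Sketch of the caller-driven tool call flow after this refactor.
// Method and field names are assumptions, not the exact lmcli API.
func handleCompletion(
	ctx context.Context,
	provider api.ChatCompletionProvider,
	msgs []api.Message,
	toolSpecs []api.ToolSpec,
) error {
	reply, err := provider.CreateChatCompletion(ctx, msgs)
	if err != nil {
		return err
	}
	if len(reply.ToolCalls) > 0 {
		// Providers no longer recurse into tool calls; the caller
		// decides, e.g. prompting the user first (see TODO below).
		if promptUser(reply.ToolCalls) { // hypothetical confirmation helper
			results, err := agent.ExecuteToolCalls(reply.ToolCalls, toolSpecs)
			if err != nil {
				return err
			}
			// Results would then be appended to the conversation and
			// sent back to the model in a follow-up completion.
			_ = results
		}
	}
	return nil
}
```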
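The request context fix in the provider implementations amounts to
binding each HTTP request to the caller's context (illustrative shape,
not the exact lmcli code):

```go
// With NewRequestWithContext, cancelling ctx (e.g. from the TUI)
// aborts the in-flight HTTP request to the provider.
func postJSON(ctx context.Context, url string, body io.Reader) (*http.Response, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return http.DefaultClient.Do(req)
}
```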
TODO:
- Prompt users before executing tool calls
- Automatically send tool results to the model (or make this toggleable)
Also: calculate the tokens per chunk for Gemini responses, fixing the
tok/s meter for Gemini models, and only consider the first candidate of
streamed Gemini responses.
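The per-chunk accounting is roughly as follows (a sketch assuming each
streamed chunk reports a cumulative `UsageMetadata.CandidatesTokenCount`,
as in the Gemini API; the other names here are hypothetical):

```go
// Sketch: derive tokens/chunk from the cumulative count Gemini reports
// on each streamed chunk, and only look at the first candidate.
var lastCount int32
for chunk := range stream {
	if len(chunk.Candidates) == 0 {
		continue
	}
	first := chunk.Candidates[0] // additional candidates are ignored
	emitContent(first)           // hypothetical: forward text to the caller

	if chunk.UsageMetadata != nil {
		delta := chunk.UsageMetadata.CandidatesTokenCount - lastCount
		lastCount = chunk.UsageMetadata.CandidatesTokenCount
		recordChunkTokens(delta) // hypothetical: feeds the tok/s meter
	}
}
```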