* gptel.el (gptel--url-get-response): If the backend-url is a
function, call it to find the full url to query.
* gptel-gemini.el: Gemini uses different urls for
streaming/oneshot responses. Set the backend-url to a function to
account for the value of gptel-stream. This is also safer than
before as the API key is not stored as part of a static url string
in memory. Fix #153.
* gptel-curl.el (gptel-curl--get-args): If the backend-url is a
function, call it to find the full url to query.
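
As a rough sketch of the dispatch described above (not the actual
gptel code: `my/resolve-backend-url` is a hypothetical helper and the
example urls are placeholders, not Gemini's endpoints), a url that may
be either a string or a function could be resolved like this:

```elisp
(defun my/resolve-backend-url (url)
  "Return URL, calling it first if it is a function."
  (if (functionp url) (funcall url) url))

;; A static url is returned unchanged.
(my/resolve-backend-url "https://example.com/v1/generate")

;; A function url is computed at request time, here depending on
;; whether streaming is enabled (placeholder endpoints).
(my/resolve-backend-url
 (lambda ()
   (if (bound-and-true-p gptel-stream)
       "https://example.com/v1/streamGenerate"
     "https://example.com/v1/generate")))
```

Because the url is built only when a request is made, any secret it
includes need not live in a long-lived string.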
README: Mention `gptel-update-destination` in README.
gptel.el (gptel-update-destination, gptel--update-status,
gptel-send, gptel--insert-response): New option
`gptel-update-destination` to control how gptel's status messages
are shown. `gptel--update-status` replaces
`gptel--update-header-line`. Replace calls to this function
elsewhere in gptel.el.
gptel-curl.el (gptel-abort, gptel-curl--stream-cleanup,
gptel-curl--stream-insert-response): Use `gptel--update-status` in
place of `gptel--update-header-line`.
gptel-transient.el (gptel--suffix-send): Use
`gptel--update-status` in place of `gptel--update-header-line`.
* gptel-curl.el (gptel-curl--sentinel, gptel-curl--stream-filter):
Remove redundant calls to `gptel-curl--stream-insert-response`
when the response being inserted is nil or a blank string. This
should be a modest boost to streaming performance.
* gptel.el (gptel-auto-scroll, gptel-end-of-response,
gptel-post-response-hook, gptel-post-stream-hook): Add
`gptel-post-stream-hook` that runs after each text insertion when
streaming responses. This can be used to, for instance,
auto-scroll the window as the response continues below the
viewport. The utility function `gptel-auto-scroll` does this.
Provide a utility command `gptel-end-of-response`, which moves the
cursor to the end of the response when it is in or before it.
* gptel-curl.el (gptel-curl--stream-insert-response): Run
`gptel-post-stream-hook` where required.
* README: Add FAQ, simplify structure, mention the new hooks and
scrolling/navigation options.
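
A possible user configuration for the hooks described above, assuming
both hooks call their functions with no arguments:

```elisp
;; Keep the window scrolled to the response while it streams in.
(add-hook 'gptel-post-stream-hook #'gptel-auto-scroll)
;; Move the cursor to the end of the response once it is complete.
(add-hook 'gptel-post-response-hook #'gptel-end-of-response)
```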
* gptel-curl.el (gptel-curl--common-args): Following the
discussion in #143, use "-y300 -Y1" as Curl arguments instead of
specifying the timeout. Now the connection stays open unless the
transfer speed stays below 1 byte per second for 300 seconds.
gptel: Add customizable prompt/response prefixes
gptel.el (gptel-prompt-prefix-alist, gptel-response-prefix-alist,
gptel-prompt-prefix-string, gptel-response-prefix-string,
gptel--url-get-response): Add customizable response prefixes (per
major-mode) in `gptel-response-prefix-alist`.
Rename `gptel-prompt-string` -> `gptel-prompt-prefix-string`.
The function `gptel-response-prefix-string` returns the prefix
string for the response in the current major-mode.
gptel-openai.el, gptel-ollama.el (gptel--parse-buffer): Remove the
prompt and response prefixes when creating prompt strings to send
to the LLM API.
gptel-curl.el (gptel-curl--stream-cleanup,
gptel-curl--stream-insert-response): Insert the response prefix
for the current major-mode before inserting the LLM API response.
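
A sketch of how the prefixes might be customized, assuming both
options are alists mapping a major-mode symbol to a prefix string (the
prefix strings below are illustrative, not necessarily the defaults):

```elisp
(setq gptel-prompt-prefix-alist
      '((markdown-mode . "### ")
        (org-mode . "*** ")
        (text-mode . "### ")))

(setq gptel-response-prefix-alist
      '((markdown-mode . "**Response:** ")
        (org-mode . "Response: ")))
```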
gptel-curl.el (gptel-curl--get-args,
gptel-curl-file-size-threshold): Use temporary file for curl data.
Ensure curl uses a temporary file for binary data to prevent
issues with large payloads and special characters:
- Add a new defcustom `gptel-curl-file-size-threshold` to
determine when to use a temporary file for passing data to Curl.
- Use `--data-binary` with a temp file for data larger than the
specified threshold, improving handling of large data payloads in
GPTel queries.
- Reliably clean up temporary files created for Curl requests
exceeding the size threshold. Add a function to
`gptel-post-response-hook` to delete the file post-Curl execution
and remove itself from the hook, preventing temporary file
accumulation.
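
For example, to lower the point at which a temporary file is used (an
illustrative value, assuming the threshold is a size in bytes):

```elisp
;; Assumption: the value is compared against the request size in bytes.
(setq gptel-curl-file-size-threshold 100000)
```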
gptel-curl.el (gptel-curl--common-args, gptel-curl--get-args):
Don't use compression with Curl on Windows, since it seems to
be generally not supported. Fix #90.
* gptel.el (gptel--url-get-response, gptel--url-parse-response):
- When the query fails, the error message format (in the JSON)
differs between APIs. Ultimately it may be required to dispatch
error handling via a generic function, but for now: try to make
the error handling API-agnostic.
- Mention the backend name in the error message. Pass the backend
to the (non-streaming response) parsers to be able to do this.
* gptel-curl.el (gptel-curl--stream-cleanup,
gptel-curl--parse-response): Same changes.
gptel-curl.el (gptel-curl--get-args): Increase curl timeout.
Often local LLMs will offload a query to CPU if there is not enough VRAM or in
the case of an unsupported GPU. When a query is offloaded to the CPU responses
can be significantly slower. If curl times out early the user will not get the
response from the LLM back in Emacs.
This change increases the timeout for curl from 60s to 300s to make gptel usable
in slower environments.
Closes #125
README.org: Update README with new information and a multi-llm demo.
gptel.el (gptel-host, gptel--known-backends, gptel--api-key,
gptel--create-prompt, gptel--request-data, gptel--parse-buffer, gptel-request,
gptel--parse-response, gptel--openai, gptel--debug, gptel--restore-state,
gptel, gptel-backend):
Integrate multiple LLMs through the introduction of gptel-backends. Each backend
is composed of two pieces:
1. An instance of a cl-struct, containing connection, authentication and model
information. See the cl-struct `gptel-backend` for details. A separate
cl-struct type is defined for each supported backend (OpenAI, Azure, GPT4All and
Ollama) that inherits from the generic gptel-backend type.
2. cl-generic implementations of specific tasks, like gathering up and
formatting context (previous user queries and LLM responses), parsing responses
or response streams etc. The four tasks currently specialized this way are
carried out by `gptel--parse-buffer` and `gptel--request-data` (for constructing
the query) and `gptel--parse-response` and `gptel-curl--parse-stream` (for
parsing the response). See their implementations for details. Some effort has
been made to limit the number of times dispatching is done when reading
streaming responses.
When a backend is created, it is registered in the collection
`gptel--known-backends` and can be accessed by name later, such as from the
transient menu.
Only one of these backends is active at any time in a buffer, stored in the
buffer-local variable `gptel-backend`. Most messaging, authentication etc
accounts for the active backend, although there might be some leftovers.
When using `gptel-request` or `gptel-send`, the active backend can be changed or
let-bound.
- Obsolete `gptel-host`
- Fix the rear-sticky property when restoring sessions from files.
- Document some variables (not user options), like `gptel--debug`
gptel-openai.el (gptel-backend, gptel-make-openai, gptel-make-azure,
gptel-make-gpt4all): This file (currently always loaded) sets up the generic
backend struct and includes constructors for creating OpenAI, GPT4All and Azure
backends. They all use the same API, so a single set of defgeneric
implementations suffices for all of them.
gptel-ollama.el (gptel-make-ollama): This file includes the cl-struct,
constructor and requisite defgeneric implementations for Ollama support.
gptel-transient.el (gptel-menu, gptel-provider-variable, gptel--infix-provider,
gptel-suffix-send):
- Provide access to all available LLM backends and models from `gptel-menu`.
- Adjust keybindings in gptel-menu: setting the model and query parameters is
now bound to two char keybinds, while redirecting input and output is bound to
single keys.
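
A sketch of registering a backend and making it the default, using the
`gptel-make-ollama` constructor named above. The keyword arguments,
host and model name are assumptions for illustration; check the
constructor's docstring for the supported keys:

```elisp
(setq-default gptel-backend
              (gptel-make-ollama
               "Ollama"                     ;any name of your choosing
               :host "localhost:11434"      ;where the server is assumed to run
               :models '("mistral:latest")  ;models assumed to be installed
               :stream t))                  ;stream responses
```

Since `gptel-backend` is buffer-local, `setq-default` changes the
default for all buffers, while a plain `setq` in a chat buffer would
affect only that buffer.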
gptel.el (gptel--insert-response):
gptel-curl.el (gptel-curl--stream-insert-response): Make the `gptel'
text-property rear-nonsticky so typing after it is recognized as part of the
user prompt.
* gptel.el (gptel--insert-response, gptel-pre-response-hook): New
user option `gptel-pre-response-hook' that runs before the
response is inserted into the buffer. This can be used to prepare
the buffer in some user-specified way for the response.
* gptel-curl.el (gptel-curl--stream-filter): Run
`gptel-pre-response-hook' before inserting streaming responses.
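
A minimal sketch of using the new hook, assuming it is run with no
arguments (`my/gptel-note-response-start' is a hypothetical function):

```elisp
(defun my/gptel-note-response-start ()
  "Hypothetical example: announce that a response is about to be inserted."
  (message "gptel: inserting response into %s" (buffer-name)))

(add-hook 'gptel-pre-response-hook #'my/gptel-note-response-start)
```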
* gptel-curl.el (gptel-curl-get-response): Don't convert response
into org-mode unless the buffer from which the request originated
is in org-mode. This makes `gptel-default-mode' less binding, and
only used when creating a new chat session with `gptel'. Also,
gptel should now do the right thing depending on whether the
current buffer is in text, Markdown or Org modes.
* gptel.el (gptel-proxy): Support a proxy when interacting with the
OpenAI endpoint. In many organizations the OpenAI API can only be
accessed via a proxy. This is easily supported by curl.
gptel-curl.el (gptel-curl--get-args): tidy up `gptel-curl--get-args'.
---------
Co-authored-by: PalaceChan <XXX>
* gptel-curl.el (gptel-curl--stream-cleanup): `gptel-post-response-hook' should
run in the buffer that was current when the request was sent. This was not the
case for the curl method (with response streaming). Fixed.
* gptel.el (gptel--insert-response):
* gptel-transient.el (gptel--suffix-send):
* gptel-curl.el (gptel-curl--stream-filter, gptel-curl--stream-insert-response,
gptel-curl--stream-cleanup):
Handle read-only gptel buffers by redirecting the output to a new buffer (that
pops up automatically). To track this,
- the `:position' argument of the INFO plist, which is a marker, is moved to the
new output buffer.
- the `:buffer' argument of the INFO plist is unmodified; it always points to
the buffer that the request originated from.
* gptel-curl.el (gptel-curl-get-response): Set buffer-local model parameters in
the correct (i.e. gptel) buffer, not in Curl's process buffer. This fixes #43.
* gptel.el (gptel--insert-response, gptel-request):
- Add an in-place key to gptel-request. When true, the default
callbacks will not delimit the API responses with newlines.
- Add a stream option to gptel-request. Only works with the default
filter/stream-insert callback, so it's marked as for internal use
for now.
* gptel-curl.el (gptel-curl--stream-insert-response): Ditto.
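
A sketch of how the new key might be used. The prompts are
illustrative, and the callback signature (the response text, or nil on
failure, followed by the info plist) is an assumption rather than
something stated in this change:

```elisp
;; With the default callback: insert the response without delimiting
;; it with newlines.
(gptel-request "Write a haiku about Emacs." :in-place t)

;; With a custom callback (assumed signature: RESPONSE INFO).
(gptel-request
 "Write a haiku about Emacs."
 :callback (lambda (response _info)
             (if response
                 (message "LLM: %s" response)
               (message "gptel-request failed"))))
```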
* gptel.el (gptel--url-parse-response, gptel--url-get-response,
gptel--insert-response, gptel-send):
- Use shorter keys for passing the info plist,
- record errors in the info plist,
- separate user messaging from the callback and more.
- Make the API more functional (i.e. less imperative)
This is in preparation for adding `gptel-request', an API for
defining custom commands.
Note: The streaming filter and callback are mostly unchanged.
Streaming is not planned to be accessible via `gptel-request'.
* gptel-curl.el (gptel-curl--parse-response, gptel-curl--sentinel,
gptel-curl--stream-filter, gptel-curl--stream-insert-response,
gptel-curl--stream-cleanup, gptel-curl-get-response): Ditto.
* gptel.el (gptel--url-parse-response, gptel--insert-response):
Use the same error codes/descriptions across url-retrieve/Curl,
with and without streaming responses.
* gptel-curl.el (gptel-curl--parse-response,
gptel-curl--stream-filter, gptel-curl--stream-cleanup): Ditto.
* gptel-curl.el (gptel-curl--stream-filter,
gptel-curl--stream-cleanup): When streaming responses, move the
error handling from the Curl process filter to the cleanup
sentinel. This simplifies the filter code a fair bit.
* gptel.el (gptel--url-get-response, gptel--insert-response,
gptel-send): Rename the :insert-marker keyword in the async info
plist to :start-marker.
* gptel-curl.el (gptel--insert-response-stream,
gptel-curl--cleanup-stream, gptel-curl-get-response): Ditto.
* gptel.el (gptel--convert-playback-markdown->org): New converter
for markdown->org that works on text chunks while maintaining the
parse state until the text stream is finished.
* gptel-curl.el (gptel--insert-response-stream,
gptel-curl-get-response): When using `gptel-playback' and
requesting ChatGPT's responses in org-mode, run the above
converter on the received response. This works by storing the
converter and associated state as a closure in the async info
plist that is supplied along with the response, and running it
repeatedly on each chunk of text in the response stream before it
is inserted into the buffer.
FIXME: Note that `gptel-response-filter-functions' is currently
ignored if using `gptel-stream'.
* gptel.el (gptel--request-data): Request a streaming message if
`gptel-stream' is non-nil.
* gptel-curl.el (gptel-curl-get-response,
gptel-curl--cleanup-stream, gptel-curl--filter): Add a process
filter and sentinel for Curl to stream ChatGPT's response into
Emacs in real-time.
* gptel.el (gptel--url-get-response,
gptel-api-key-from-auth-source): `gptel--url-get-response' accepts
a callback argument that can be used to do something besides
inserting the response into the current buffer.
* gptel-curl.el (gptel-curl--sentinel, gptel-curl-get-response):
`gptel-curl--sentinel' now accepts a callback argument that can be
used to do something besides inserting the response into the
current buffer.
These changes are in preparation for more specific functionality,
like showing the response as a message, or replacing the prompt
with the response etc.
* gptel.el (gptel--url-parse-response): Produce better error
messages when using `url-retrieve'. This includes JSON parsing
failures and insufficient quota messages.
* gptel-curl.el (gptel-curl--parse-response): Produce better error
messages when using curl. This includes JSON parsing failures
and insufficient quota messages.
* gptel.el (gptel-send, gptel--insert-response,
gptel--url-get-response): Remove aio dependency, turn aio-defuns
into regular functions. This requires splitting `gptel-send' into
"before response" and "after response" functions, but the ability
to debug the code fully is worth the inconvenience. The new "after
response" function is `gptel--insert-response'.
* gptel-curl.el (gptel-curl--sentinel, gptel-curl-get-response):
Turn aio-defuns into regular functions.
gptel.el (gptel--debug, gptel--url-parse-response): Add a debug
flag that shows the http response. Fix json parsing error.
gptel-curl.el (gptel-curl--sentinel): Ditto.
gptel.el (gptel--url-get-response): When `gptel-send' is called
directly, the API key is assumed to exist. Ensure that it is read.
gptel-curl.el (gptel-curl--get-args): Ditto.
gptel-curl.el (gptel-curl--process-alist, gptel-curl--get-args,
gptel--curl-sentinel, gptel-curl--parse-response): Rename internal
functions and variables to use the `gptel-curl--` prefix instead of
`gptel--curl-`.
gptel.el (gptel--system-message, gptel--system-message-alist,
gptel--model, gptel--temperature, gptel--max-tokens,
gptel--request-data): Add new buffer-local variables to hold API
parameters. Generating the full request data plist is now done in a
separate function, `gptel--request-data'.
gptel-curl.el (gptel-curl-get-response): Rename from `gptel--curl-get-response'
and autoload it to ease its use in `gptel-send'. Remove Version header
identifying gptel-curl as a separate package and make it require `gptel' instead.
Conditionally solves #2.
gptel.el (gptel-use-curl, gptel-parse-response, gptel--playback,
gptel-send, gptel-playback): New user options `gptel-playback',
`gptel-use-curl`. The former controls whether the response is played
back in chunks, which is done by the function `gptel--playback'. The
response returned by `gptel-get-response' and `gptel--curl-get-response'
is now a plist with the content and status.
gptel-curl.el (gptel--curl-get-args, gptel--curl-get-response,
gptel--curl-sentinel): Add support for curl when available. Set it to
the default. `url-retrieve' is full of fangs that multibyte you.