* gptel.el (gptel--convert-markdown->org,
gptel--stream-convert-markdown->org, gptel-set-topic): Move code
for transforming responses and setting the GPTEL_TOPIC property to
gptel-org. Add declarations for the byte-compiler.
* gptel-org.el (gptel-org-branching-context, gptel-org-set-topic,
gptel-org--restore-state, gptel--convert-markdown->org,
gptel--stream-convert-markdown->org): Add `gptel-org-set-topic` to
set the topic per heading in Org buffers. Fix typo in
gptel-org--restore-state. Add declarations for the byte-compiler,
and add page markers for easier reading.
* gptel.el (gptel--insert-response): Moving point in a buffer does
not move window-point in a window displaying that buffer if the
window is not selected (#269). So select the gptel response
window explicitly (if possible) when running
`gptel-post-response-functions`.
NOTE: Instead of selecting the window before running the hook, we
could call `set-window-point` on the window after running it. But
since the hook can have any kind of behavior (smooth scrolling, for
instance), we want it to play out interactively.
* gptel-curl.el (gptel--curl-stream-cleanup): Ditto.
* gptel.el (gptel-request, gptel-send, gptel--url-get-response):
Consolidate the HTTP query construction into `gptel-request`,
and use it as the single point for sending gptel queries. This
simplifies the code, makes it easier to debug and (later) advise
or otherwise modify. To this end, `gptel-send` reuses
`gptel-request` and no longer does its own thing. Adjust other
HTTP request-related functions accordingly.
* gptel-curl.el (gptel-curl--get-args, gptel-curl--get-response):
Receive the full request data instead of constructing it partly in
`gptel-curl--get-args`.
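For illustration, a minimal `gptel-request` call of the kind this
consolidation enables (the prompt, callback and plist keys shown are
an illustrative sketch, not prescriptions):
  (gptel-request "Why is the scratch buffer empty?"
    :callback (lambda (response info)
                (if response
                    (message "gptel: %s" response)
                  (message "gptel request failed: %s"
                           (plist-get info :status)))))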
* gptel.el (gptel--url-get-response): Log request headers as JSON objects.
* gptel-curl.el (gptel-curl--get-args, gptel-curl--get-response):
Log request headers as JSON objects, and the Curl command as a
shell command that can be copied and pasted into a terminal.
Using the libjansson JSON parser gives us a modest boost in speed.
It's not as significant a speedup as it is for LSP clients, since
our JSON payloads are smaller and less frequent -- but we might as
well use it.
* gptel.el (gptel--json-read, gptel--json-encode,
gptel--url-get-response, gptel--parse-response): Define macros to
use the libjansson-supported `json-parse-buffer` and
`json-serialize`. Replace use of `json-encode` and `json-read`
appropriately.
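Roughly the idea behind these macros (a sketch, not the exact
definitions in gptel.el):
  ;; Prefer the native (libjansson-backed) JSON primitives when the
  ;; Emacs build provides them, falling back to json.el otherwise.
  (defmacro my/json-read ()
    (if (fboundp 'json-parse-buffer)
        '(json-parse-buffer :object-type 'plist)
      '(progn (require 'json) (json-read))))
  (defmacro my/json-encode (object)
    (if (fboundp 'json-serialize)
        `(json-serialize ,object)
      `(progn (require 'json) (json-encode ,object))))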
* gptel-openai.el (gptel-curl--parse-stream): Use
`gptel--json-read` instead of `json-read`.
* gptel-ollama.el (gptel-curl--parse-stream): Use
`gptel--json-read` instead of `json-read`.
* gptel-gemini.el (gptel-curl--parse-stream): Use
`gptel--json-read` instead of `json-read`.
* gptel-curl.el (gptel-curl--get-args, gptel-curl--get-response,
gptel-curl--log-response, gptel-curl--stream-cleanup,
gptel-curl--parse-response): Use `gptel--json-read` and
`gptel--json-encode` in place of the json.el versions.
* gptel-anthropic.el (gptel-curl--parse-stream): Use
`gptel--json-read` instead of `json-read`.
* test/gptel-org-test.el: Use `gptel--json-read`.
* gptel.el (gptel--insert-response, gptel--restore-state): Don't
set the rear-nonsticky property on the gptel response text.
Instead, the gptel text-property is globally declared to be
nonsticky via `text-property-default-nonsticky`.
This change is needed because markdown-mode makes rear-nonsticky a
font-lock-managed property, so we can't set it manually. Setting
this property explicitly also makes all text properties in the
range nonsticky, which can have unintended side effects.
* gptel-curl.el (gptel-curl--stream-insert-response): Ditto.
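The global declaration amounts to something like this (illustrative):
  ;; Declare the `gptel' text property rear-nonsticky everywhere,
  ;; instead of adding rear-nonsticky to each response span.
  (add-to-list 'text-property-default-nonsticky '(gptel . t))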
gptel.el (gptel--url-parse-response): Handle HTTP 100 followed by
200. Note: this fix is brittle; it will break if 100 is followed
by an error code.
gptel-curl.el (gptel-curl--stream-filter,
gptel-curl--parse-stream): Ditto. Address #194.
gptel-curl.el (gptel-curl--stream-cleanup): The LLM response can
be empty with HTTP status 200, for example when the API responds
with an error description instead. Handle this case gracefully.
When `gptel-mode` is enabled, also inform the user that the
response was empty. Fix #234.
gptel.el (gptel-post-response-functions): Documentation. Explain
the arguments passed to each hook function in this hook when the
response fails or is empty.
gptel-curl.el (gptel-curl--stream-cleanup,
gptel-curl--stream-insert-response): Don't consider
`gptel-response-prefix-string` part of the response for the
purpose of running `gptel-post-response-functions`.
gptel-openai.el (gptel-backend, gptel-make-openai,
gptel-make-azure): Add a curl-args slot to the backend struct for
additional Curl arguments.
Usage example: This can be used to set the `--cert` and `--key`
options in a custom backend that uses mutual TLS to communicate
with an OpenAI proxy/gateway.
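A hypothetical backend along those lines (the name, host, certificate
paths and model are placeholders):
  (gptel-make-openai "llm-gateway"
    :host "llm.example.com"
    :stream t
    :models '("gpt-4")
    :curl-args '("--cert" "~/certs/client.pem"
                 "--key"  "~/certs/client.key"))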
gptel-curl.el (gptel-curl--get-args): Add backend-specific
curl-args when creating HTTP requests.
gptel-gemini.el (gptel-make-gemini): Add a curl-args slot to the
constructor.
gptel-kagi.el (gptel-make-kagi): Ditto.
gptel-ollama.el (gptel-make-ollama): Ditto.
* gptel.el (gptel--debug, gptel-log-level, gptel--log-buffer-name,
gptel--url-get-response, gptel--parse-response): Optionally log
all request and response data to `gptel--log-buffer-name`, with
the log level governed by `gptel-log-level`. Obsolete
`gptel--debug`.
* gptel-curl.el (gptel-curl--log-response, gptel-curl--get-args,
gptel-curl--get-response, gptel-curl--stream-cleanup,
gptel-curl--sentinel): Add support for logging.
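For example (the levels shown are as described above; see the
docstring of `gptel-log-level` for the full set of values):
  ;; Log request and response headers/metadata:
  (setq gptel-log-level 'info)
  ;; or log everything, including full payloads:
  ;; (setq gptel-log-level 'debug)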
* gptel.el (gptel--insert-response): Turn on visual-line-mode in
the response buffer that is created when the gptel buffer is
read-only.
* gptel-curl.el (gptel-curl--stream-insert-response): Ditto.
* gptel.el (gptel-end-of-response, gptel-post-response-hook,
gptel-post-response-functions, gptel--insert-response,
gptel-response-filter-functions):
Rename gptel-post-response-hook -> gptel-post-response-functions
The new abnormal hook now calls its functions with the start and
end positions of the response, to make it easier to act on the
response.
* gptel-curl.el (gptel-curl--stream-cleanup): Corresponding changes.
* README.org: Mention breaking change.
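A sketch of a function suitable for the new abnormal hook (the
function name and body are illustrative):
  (defun my/gptel-report-response (beg end)
    "Report the size of the LLM response between BEG and END."
    (message "gptel: response spans %d characters" (- end beg)))
  (add-hook 'gptel-post-response-functions #'my/gptel-report-response)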
* gptel.el (gptel--url-get-response): If the backend-url is a
function, call it to find the full url to query.
* gptel-gemini.el: Gemini uses different urls for
streaming/oneshot responses. Set the backend-url to a function to
account for the value of gptel-stream. This is also safer than
before as the API key is not stored as part of a static url string
in memory. Fix #153.
* gptel-curl.el (gptel-curl--get-args): If the backend-url is a
function, call it to find the full url to query.
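The dispatch is along these lines (an illustrative sketch, not the
verbatim code):
  ;; The url slot may now hold a function; call it to obtain the
  ;; endpoint for this particular request.
  (let ((url (gptel-backend-url gptel-backend)))
    (if (functionp url) (funcall url) url))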
README: Mention `gptel-update-destination` in README.
gptel.el (gptel-update-destination, gptel--update-status,
gptel-send, gptel--insert-response): New option
`gptel-update-destination` to control how gptel's status messages
are shown. `gptel--update-status` replaces
`gptel--update-header-line`. Replace calls to this function
elsewhere in gptel.el.
gptel-curl.el (gptel-abort, gptel-curl--stream-cleanup,
gptel-curl--stream-insert-response): Use `gptel--update-status` in
place of `gptel--update-header-line`.
gptel-transient.el (gptel--suffix-send): Use
`gptel--update-status` in place of `gptel--update-header-line`.
* gptel-curl.el (gptel-curl--sentinel, gptel-curl--stream-filter):
Remove redundant calls to `gptel-curl--stream-insert-response`
when the response being inserted is nil or a blank string. This
should be a modest boost to streaming performance.
* gptel.el (gptel-auto-scroll, gptel-end-of-response,
gptel-post-response-hook, gptel-post-stream-hook): Add
`gptel-post-stream-hook` that runs after each text insertion when
streaming responses. This can be used to, for instance,
auto-scroll the window as the response continues below the
viewport. The utility function `gptel-auto-scroll` does this.
Provide a utility command `gptel-end-of-response`, which moves the
cursor to the end of the response when point is inside or before it.
* gptel-curl.el (gptel-curl--stream-insert-response): Run
`gptel-post-stream-hook` where required.
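For example, to get the scrolling and navigation behavior described
above:
  ;; Scroll the window as the streaming response grows past the
  ;; bottom of the viewport.
  (add-hook 'gptel-post-stream-hook #'gptel-auto-scroll)
  ;; Jump to the end of the response once it has fully arrived.
  (add-hook 'gptel-post-response-hook #'gptel-end-of-response)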
* README: Add FAQ, simplify structure, mention the new hooks and
scrolling/navigation options.
* gptel-curl.el (gptel-curl--common-args): Following the
discussion in #143, use "-y300 -Y1" as Curl arguments instead of
specifying a timeout. Now the connection stays open unless less than
1 byte per second is exchanged over a 300 second window.
gptel: Add customizable prompt/response prefixes
gptel.el (gptel-prompt-prefix-alist, gptel-response-prefix-alist,
gptel-prompt-prefix-string, gptel-response-prefix-string,
gptel--url-get-response): Add customizable response prefixes (per
major-mode) in `gptel-response-prefix-alist`.
Rename `gptel-prompt-string` -> `gptel-prompt-prefix-string`
The function `gptel-response-prefix-string` returns the prefix
string for the response in the current major-mode.
gptel-openai.el, gptel-ollama.el (gptel--parse-buffer): Remove the
prompt and response prefixes when creating prompt strings to send
to the LLM API.
gptel-curl.el (gptel-curl--stream-cleanup,
gptel-curl--stream-insert-response): Insert the response prefix
for the current major-mode before inserting the LLM API response.
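For instance (the prefix string here is illustrative):
  ;; Prefix LLM responses with "Response:" in text-mode buffers.
  (setf (alist-get 'text-mode gptel-response-prefix-alist)
        "Response:\n")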
gptel-curl.el (gptel-curl--get-args,
gptel-curl-file-size-threshold): Use a temporary file for curl data.
Ensure curl reads binary data from a temporary file to prevent
issues with large payloads and special characters:
- Add a new defcustom `gptel-curl-file-size-threshold` to
determine when to use a temporary file for passing data to Curl.
- Use `--data-binary` with a temp file for data larger than the
specified threshold, improving handling of large data payloads in
GPTel queries.
- Reliably clean up temporary files created for Curl requests
exceeding the size threshold. Add a function to
`gptel-post-response-hook` to delete the file post-Curl execution
and remove itself from the hook, preventing temporary file
accumulation.
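Roughly the approach (a sketch, not the exact code in gptel-curl.el):
  (defun my/gptel-curl-data-arg (data)
    "Return a curl data argument for DATA, via a temp file if large."
    (if (> (length data) gptel-curl-file-size-threshold)
        (format "--data-binary=@%s"
                (make-temp-file "gptel-curl" nil ".json" data))
      (format "-d%s" data)))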
gptel-curl.el (gptel-curl--common-args, gptel-curl--get-args):
Don't use compression with Curl on Windows, since it does not seem
to be generally supported there. Fix #90.
* gptel.el (gptel--url-get-response, gptel--url-parse-response):
- When the query fails, the error message format (in the JSON)
differs between APIs. Ultimately it may be required to dispatch
error handling via a generic function, but for now: try to make
the error handling API agnostic.
- Mention the backend name in the error message. Pass the backend
to the (non-streaming response) parsers to be able to do this.
* gptel-curl.el (gptel-curl--stream-cleanup,
gptel-curl--parse-response): Same changes.
gptel-curl.el (gptel-curl--get-args): Increase curl timeout.
Local LLMs often offload a query to the CPU when there is not enough
VRAM or the GPU is unsupported. When a query is offloaded to the CPU,
responses can be significantly slower; if curl times out early, the
user never gets the LLM's response back in Emacs.
This change increases the curl timeout from 60s to 300s to make gptel
usable in slower environments.
Closes #125
README.org: Update README with new information and a multi-llm demo.
gptel.el (gptel-host, gptel--known-backends, gptel--api-key,
gptel--create-prompt, gptel--request-data, gptel--parse-buffer, gptel-request,
gptel--parse-response, gptel--openai, gptel--debug, gptel--restore-state,
gptel, gptel-backend):
Integrate multiple LLMs through the introduction of gptel-backends. Each backend
is composed of two pieces:
1. An instance of a cl-struct, containing connection, authentication and model
information. See the cl-struct `gptel-backend` for details. A separate
cl-struct type is defined for each supported backend (OpenAI, Azure, GPT4All and
Ollama) that inherits from the generic gptel-backend type.
2. cl-generic implementations of specific tasks, like gathering up and
formatting context (previous user queries and LLM responses), parsing responses
or response streams, etc. The four tasks currently specialized this way are
carried out by `gptel--parse-buffer` and `gptel--request-data` (for constructing
the query) and `gptel--parse-response` and `gptel-curl--parse-stream` (for
parsing the response). See their implementations for details. Some effort has
been made to limit the number of times dispatching is done when reading
streaming responses.
When a backend is created, it is registered in the collection
`gptel--known-backends` and can be accessed by name later, such as from the
transient menu.
Only one of these backends is active at any time in a buffer, stored in the
buffer-local variable `gptel-backend`. Most messaging, authentication,
etc. takes the active backend into account, although there might be
some leftovers.
When using `gptel-request` or `gptel-send`, the active backend can be changed or
let-bound.
- Obsolete `gptel-host`
- Fix the rear-sticky property when restoring sessions from files.
- Document some variables (not user options), like `gptel--debug`
gptel-openai.el (gptel-backend, gptel-make-openai, gptel-make-azure,
gptel-make-gpt4all): This file (currently always loaded) sets up the generic
backend struct and includes constructors for creating OpenAI, GPT4All and Azure
backends. They all use the same API, so a single set of defgeneric
implementations suffices for all of them.
gptel-ollama.el (gptel-make-ollama): This file includes the cl-struct,
constructor and requisite defgeneric implementations for Ollama support.
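A typical constructor call, registering a backend and making it the
default (the host and model names are placeholders):
  (setq-default gptel-backend
                (gptel-make-ollama "Ollama"
                  :host "localhost:11434"
                  :stream t
                  :models '("mistral:latest")))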
gptel-transient.el (gptel-menu, gptel-provider-variable, gptel--infix-provider,
gptel-suffix-send):
- Provide access to all available LLM backends and models from `gptel-menu`.
- Adjust keybindings in gptel-menu: setting the model and query parameters is
now bound to two char keybinds, while redirecting input and output is bound to
single keys.
gptel.el (gptel--insert-response):
gptel-curl.el (gptel-curl--stream-insert-response): Make the `gptel'
text-property rear-nonsticky so typing after it is recognized as part of the
user prompt.
* gptel.el (gptel--insert-response, gptel-pre-response-hook): New
user option `gptel-pre-response-hook' that runs before the
response is inserted into the buffer. This can be used to prepare
the buffer in some user-specified way for the response.
* gptel-curl.el (gptel-curl--stream-filter): Run
`gptel-pre-response-hook' before inserting streaming responses.
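For example, a hypothetical preparation step:
  ;; Remove any narrowing before the response is inserted.
  (add-hook 'gptel-pre-response-hook #'widen)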
* gptel-curl.el (gptel-curl-get-response): Don't convert response
into org-mode unless the buffer from which the request originated
is in org-mode. This makes `gptel-default-mode' less binding, and
only used when creating a new chat session with `gptel'. Also,
gptel should now do the right thing depending on whether the
current buffer is in text, Markdown or Org modes.
* gptel.el (gptel-proxy): Support a proxy when interacting with openai
endpoint. In many organizations the openai api can only be accessed
via proxy. This is easily supported by curl.
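For example (the proxy URL is a placeholder):
  ;; Route gptel's curl requests through the organization's proxy.
  (setq gptel-proxy "https://proxy.example.com:3128")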
gptel-curl.el (gptel-curl--get-args): Tidy up `gptel-curl--get-args'.
Co-authored-by: PalaceChan <XXX>
* gptel-curl.el (gptel-curl--stream-cleanup): `gptel-post-response-hook' should
run in the buffer that was current when the request was sent. This was not the
case for the curl method (with response streaming). Fixed.
* gptel.el (gptel--insert-response):
* gptel-transient.el (gptel--suffix-send):
* gptel-curl.el (gptel-curl--stream-filter, gptel-curl--stream-insert-response,
gptel-curl--stream-cleanup):
Handle read-only gptel buffers by redirecting the output to a new buffer (that
pops up automatically). To track this,
- the `:position' argument of the INFO plist, which is a marker, is moved to the
new output buffer.
- the `:buffer' argument of the INFO plist is unmodified, it always points to
the buffer that the request originated from.
* gptel-curl.el (gptel-curl-get-response): Set buffer-local model parameters in
the correct (i.e. gptel) buffer, not in Curl's process buffer. This fixes #43.
* gptel.el (gptel--insert-response, gptel-request):
- Add an in-place key to gptel-request. When true, the default
callbacks will not delimit the API responses with newlines.
- Add a stream option to gptel-request. It only works with the default
filter/stream-insert callback, so it's marked as for internal use
for now.
* gptel-curl.el (gptel-curl--stream-insert-response): Ditto.
* gptel.el (gptel--url-parse-response, gptel--url-get-response,
gptel--insert-response, gptel-send):
- Use shorter keys for passing the info plist,
- record errors in the info plist,
- separate user messaging from the callback, and more.
- Make the API more functional (i.e. less imperative).
This is in preparation for adding `gptel-request', an API for
defining custom commands.
Note: The streaming filter and callback are mostly unchanged.
Streaming is not planned to be accessible via `gptel-request'.
* gptel-curl.el (gptel-curl--parse-response, gptel-curl--sentinel,
gptel-curl--stream-filter, gptel-curl--stream-insert-response,
gptel-curl--stream-cleanup, gptel-curl-get-response): Ditto.
* gptel.el (gptel--url-parse-response, gptel--insert-response):
Use the same error codes/descriptions across url-retrieve/Curl,
with and without streaming responses.
* gptel-curl.el (gptel-curl--parse-response,
gptel-curl--stream-filter, gptel-curl--stream-cleanup): Ditto.
* gptel-curl.el (gptel-curl--stream-filter,
gptel-curl--stream-cleanup): When streaming responses, move the
error handling from the Curl process filter to the cleanup
sentinel. This simplifies the filter code a fair bit.
* gptel.el (gptel--url-get-response, gptel--insert-response,
gptel-send): Rename the :insert-marker keyword in the async info
plist to :start-marker.
* gptel-curl.el (gptel--insert-response-stream,
gptel-curl--cleanup-stream, gptel-curl-get-response): Ditto.
* gptel.el (gptel--convert-playback-markdown->org): New converter
for markdown->org that works on text chunks while maintaining the
parse state until the text stream is finished.
* gptel-curl.el (gptel--insert-response-stream,
gptel-curl-get-response): When using `gptel-playback' and
requesting ChatGPT's responses in org-mode, run the above
converter on the received response. This works by storing the
converter and associated state as a closure in the async info
plist that is supplied along with the response, and running it
repeatedly on each chunk of text in the response stream before it
is inserted into the buffer.
FIXME: Note that `gptel-response-filter-functions' is currently
ignored if using `gptel-stream'.
* gptel.el (gptel--request-data): Request a streaming message if
`gptel-stream' is non-nil.
* gptel-curl.el (gptel-curl-get-response,
gptel-curl--cleanup-stream, gptel-curl--filter): Add a process
filter and sentinel for Curl to stream ChatGPT's response into
Emacs in real-time.
* gptel.el (gptel--url-get-response,
gptel-api-key-from-auth-source): `gptel--url-get-response' accepts
a callback argument that can be used to do something besides
inserting the response into the current buffer.
* gptel-curl.el (gptel-curl--sentinel, gptel-curl-get-response):
`gptel-curl--sentinel' now accepts a callback argument that can be
used to do something besides inserting the response into the
current buffer.
These changes are in preparation for more specific functionality,
like showing the response as a message, or replacing the prompt
with the response etc.
* gptel.el (gptel--url-parse-response): Produce better error
messages when using `url-retrieve'. This includes JSON parsing
failures and insufficient quota messages.
* gptel-curl.el (gptel-curl--parse-response): Produce better error
messages when using curl. This includes JSON parsing failures
and insufficient quota messages.