#+title: GPTel: A simple LLM client for Emacs

[[https://melpa.org/#/gptel][file:https://melpa.org/packages/gptel-badge.svg]]

GPTel is a simple Large Language Model chat client for Emacs, with support for multiple models/backends.

| LLM Backend | Supports | Requires                  |
|-------------+----------+---------------------------|
| ChatGPT     | ✓        | [[https://platform.openai.com/account/api-keys][API key]]                   |
| Azure       | ✓        | Deployment and API key    |
| Ollama      | ✓        | [[https://ollama.ai/][Ollama running locally]]    |
| GPT4All     | ✓        | [[https://gpt4all.io/index.html][GPT4All running locally]]   |
| Gemini      | ✓        | [[https://makersuite.google.com/app/apikey][API key]]                   |
| Llama.cpp   | ✓        | [[https://github.com/ggerganov/llama.cpp/tree/master/examples/server#quick-start][Llama.cpp running locally]] |
| Llamafile   | ✓        | [[https://github.com/Mozilla-Ocho/llamafile#quickstart][Local Llamafile server]]    |
| PrivateGPT  | Planned  | -                         |

*General usage*: ([[https://www.youtube.com/watch?v=bsRnh_brggM][YouTube Demo]])

https://user-images.githubusercontent.com/8607532/230516812-86510a09-a2fb-4cbd-b53f-cc2522d05a13.mp4

https://user-images.githubusercontent.com/8607532/230516816-ae4a613a-4d01-4073-ad3f-b66fa73c6e45.mp4

*Multi-LLM support demo*:

https://github-production-user-asset-6210df.s3.amazonaws.com/8607532/278854024-ae1336c4-5b87-41f2-83e9-e415349d6a43.mp4

- It's async and fast, streams responses.
- Interact with LLMs from anywhere in Emacs (any buffer, shell, minibuffer, wherever)
- LLM responses are in Markdown or Org markup.
- Supports conversations and multiple independent sessions.
- Save chats as regular Markdown/Org/Text files and resume them later.
- You can go back and edit your previous prompts or LLM responses when continuing a conversation. These will be fed back to the model.

GPTel uses Curl if available, but falls back to the built-in =url-retrieve= to work without external dependencies.

** Contents :toc:
- [[#installation][Installation]]
  - [[#straight][Straight]]
  - [[#manual][Manual]]
  - [[#doom-emacs][Doom Emacs]]
  - [[#spacemacs][Spacemacs]]
- [[#setup][Setup]]
  - [[#chatgpt][ChatGPT]]
  - [[#other-llm-backends][Other LLM backends]]
    - [[#azure][Azure]]
    - [[#gpt4all][GPT4All]]
    - [[#ollama][Ollama]]
    - [[#gemini][Gemini]]
    - [[#llamacpp-or-llamafile][Llama.cpp or Llamafile]]
- [[#usage][Usage]]
  - [[#in-any-buffer][In any buffer:]]
  - [[#in-a-dedicated-chat-buffer][In a dedicated chat buffer:]]
    - [[#save-and-restore-your-chat-sessions][Save and restore your chat sessions]]
- [[#faq][FAQ]]
  - [[#i-want-the-window-to-scroll-automatically-as-the-response-is-inserted][I want the window to scroll automatically as the response is inserted]]
  - [[#i-want-the-cursor-to-move-to-the-next-prompt-after-the-response-is-inserted][I want the cursor to move to the next prompt after the response is inserted]]
  - [[#i-want-to-change-the-prefix-before-the-prompt-and-response][I want to change the prefix before the prompt and response]]
  - [[#why-another-llm-client][Why another LLM client?]]
- [[#additional-configuration][Additional Configuration]]
- [[#the-gptel-api][The gptel API]]
  - [[#extensions-using-gptel][Extensions using GPTel]]
- [[#alternatives][Alternatives]]
- [[#breaking-changes][Breaking Changes]]
- [[#acknowledgments][Acknowledgments]]

** Installation

GPTel is on MELPA. Ensure that MELPA is in your list of sources, then install gptel with =M-x package-install⏎= =gptel=.

(Optional: Install =markdown-mode=.)

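If you use =use-package=, a minimal declaration looks like this (a sketch, assuming MELPA is already enabled in =package-archives=):

#+begin_src emacs-lisp
(use-package gptel
  :ensure t)  ;Fetch from MELPA if not already installed
#+end_src
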
#+html: <details><summary>
**** Straight
#+html: </summary>
#+begin_src emacs-lisp
(straight-use-package 'gptel)
#+end_src

Installing the =markdown-mode= package is optional.
#+html: </details>
#+html: <details><summary>
**** Manual
#+html: </summary>
Clone or download this repository and run =M-x package-install-file⏎= on the repository directory.

Installing the =markdown-mode= package is optional.
#+html: </details>
#+html: <details><summary>
**** Doom Emacs
#+html: </summary>
In =packages.el=
#+begin_src emacs-lisp
(package! gptel)
#+end_src

In =config.el=
#+begin_src emacs-lisp
(use-package! gptel
 :config
 (setq! gptel-api-key "your key"))
#+end_src
#+html: </details>
#+html: <details><summary>
**** Spacemacs
#+html: </summary>
After installation with =M-x package-install⏎= =gptel=

- Add =gptel= to =dotspacemacs-additional-packages=
- Add =(require 'gptel)= to =dotspacemacs/user-config=
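
For reference, a sketch of the corresponding =.spacemacs= fragments (variable and function names as in the default dotfile template):

#+begin_src emacs-lisp
;; In the dotspacemacs/layers function:
(setq-default dotspacemacs-additional-packages '(gptel))

;; In dotspacemacs/user-config:
(require 'gptel)
#+end_src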
#+html: </details>
** Setup
*** ChatGPT
Procure an [[https://platform.openai.com/account/api-keys][OpenAI API key]].

Optional: Set =gptel-api-key= to the key. Alternatively, you may choose a more secure method such as:

- Storing in =~/.authinfo=. By default, "api.openai.com" is used as HOST and "apikey" as USER.
  #+begin_src authinfo
  machine api.openai.com login apikey password TOKEN
  #+end_src
- Setting it to a function that returns the key (see the sketch below).
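
A minimal sketch of the function approach, using the built-in auth-source library (the host name matches the default ChatGPT backend):

#+begin_src emacs-lisp
;; Look up the key at request time instead of storing it in your init file.
(setq gptel-api-key
      (lambda ()
        (auth-source-pick-first-password :host "api.openai.com")))
#+end_src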

*** Other LLM backends
#+html: <details><summary>
**** Azure
#+html: </summary>

Register a backend with
#+begin_src emacs-lisp
(gptel-make-azure
 "Azure-1"          ;Name, whatever you'd like
 :protocol "https"  ;optional -- https is the default
 :host "YOUR_RESOURCE_NAME.openai.azure.com"
 :endpoint "/openai/deployments/YOUR_DEPLOYMENT_NAME/chat/completions?api-version=2023-05-15" ;or equivalent
 :stream t          ;Enable streaming responses
 :key #'gptel-api-key
 :models '("gpt-3.5-turbo" "gpt-4"))
#+end_src
Refer to the documentation of =gptel-make-azure= to set more parameters.

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]).

If you want it to be the default, set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
(setq-default gptel-backend
              (gptel-make-azure
               "Azure-1"
               ...))
#+end_src
#+html: </details>

#+html: <details><summary>
**** GPT4All
#+html: </summary>

Register a backend with
#+begin_src emacs-lisp
(gptel-make-gpt4all
 "GPT4All"              ;Name of your choosing
 :protocol "http"
 :host "localhost:4891" ;Where it's running
 :models '("mistral-7b-openorca.Q4_0.gguf")) ;Available models
#+end_src
These are the required parameters; refer to the documentation of =gptel-make-gpt4all= for more.

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=. Additionally, you may want to increase the response token size, since GPT4All uses very short (often truncated) responses by default:

#+begin_src emacs-lisp
;; OPTIONAL configuration
(setq-default gptel-model "mistral-7b-openorca.Q4_0.gguf" ;Pick your default model
              gptel-backend (gptel-make-gpt4all "GPT4All" :protocol ...))
(setq-default gptel-max-tokens 500)
#+end_src

#+html: </details>

#+html: <details><summary>
**** Ollama
#+html: </summary>

Register a backend with
#+begin_src emacs-lisp
(gptel-make-ollama
 "Ollama"                    ;Any name of your choosing
 :host "localhost:11434"     ;Where it's running
 :models '("mistral:latest") ;Installed models
 :stream t)                  ;Stream responses
#+end_src
These are the required parameters; refer to the documentation of =gptel-make-ollama= for more.

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=:

#+begin_src emacs-lisp
;; OPTIONAL configuration
(setq-default gptel-model "mistral:latest" ;Pick your default model
              gptel-backend (gptel-make-ollama "Ollama" :host ...))
#+end_src

#+html: </details>

#+html: <details><summary>
**** Gemini
#+html: </summary>

Register a backend with
#+begin_src emacs-lisp
;; :key can be a function that returns the API key.
(gptel-make-gemini
 "Gemini"
 :key "YOUR_GEMINI_API_KEY"
 :stream t)
#+end_src
These are the required parameters; refer to the documentation of =gptel-make-gemini= for more.

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=:

#+begin_src emacs-lisp
;; OPTIONAL configuration
(setq-default gptel-model "gemini-pro" ;Pick your default model
              gptel-backend (gptel-make-gemini "Gemini" :host ...))
#+end_src

#+html: </details>

#+html: <details>
#+html: <summary>
**** Llama.cpp or Llamafile
#+html: </summary>

(If using a llamafile, run a [[https://github.com/Mozilla-Ocho/llamafile#other-example-llamafiles][server llamafile]] instead of a "command-line llamafile", and use a model that supports text generation.)

Register a backend with
#+begin_src emacs-lisp
(gptel-make-openai       ;Not a typo, same API as OpenAI
 "llama-cpp"             ;Any name
 :stream t               ;Stream responses
 :protocol "http"
 :host "localhost:8000"  ;Llama.cpp server location, typically localhost:8080 for Llamafile
 :key nil                ;No key needed
 :models '("test"))      ;Any names, doesn't matter for Llama
#+end_src
These are the required parameters; refer to the documentation of =gptel-make-openai= for more.

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
(setq-default gptel-backend (gptel-make-openai "llama-cpp" ...)
              gptel-model "test")
#+end_src

#+html: </details>
** Usage

(There is also a [[https://www.youtube.com/watch?v=bsRnh_brggM][video demo]] showing various uses of gptel.)

|--------------------+-------------------------------------------------------------------------|
| *Command*          | Description                                                             |
|--------------------+-------------------------------------------------------------------------|
| =gptel-send=       | Send conversation up to =(point)=, or selection if region is active. Works anywhere in Emacs. |
| =gptel=            | Create a new dedicated chat buffer. Not required to use gptel.          |
| =C-u= =gptel-send= | Transient menu for preferences, input/output redirection etc.           |
| =gptel-menu=       | /(Same)/                                                                |
|--------------------+-------------------------------------------------------------------------|
| =gptel-set-topic=  | /(Org-mode only)/ Limit conversation context to an Org heading          |
|--------------------+-------------------------------------------------------------------------|

*** In any buffer:

1. Call =M-x gptel-send= to send the text up to the cursor. The response will be inserted below. Continue the conversation by typing below the response.

2. If a region is selected, the conversation will be limited to its contents.

3. Call =M-x gptel-send= with a prefix argument to
   - set chat parameters (GPT model, directives etc) for this buffer,
   - read the prompt from elsewhere or redirect the response elsewhere,
   - or replace the prompt with the response.

[[https://user-images.githubusercontent.com/8607532/230770018-9ce87644-6c17-44af-bd39-8c899303dce1.png]]

With a region selected, you can also rewrite prose or refactor code from here:

*Code*:

[[https://user-images.githubusercontent.com/8607532/230770162-1a5a496c-ee57-4a67-9c95-d45f238544ae.png]]

*Prose*:

[[https://user-images.githubusercontent.com/8607532/230770352-ee6f45a3-a083-4cf0-b13c-619f7710e9ba.png]]

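Since =gptel-send= works from any buffer, you may want to give it a global binding; the key choice below is only illustrative, not one gptel sets up itself:

#+begin_src emacs-lisp
(global-set-key (kbd "C-c g") #'gptel-send)
#+end_src
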
*** In a dedicated chat buffer:

1. Run =M-x gptel= to start or switch to the chat buffer. It will ask you for the key if you skipped the previous step. Run it with a prefix argument (=C-u M-x gptel=) to start a new session.

2. In the gptel buffer, send your prompt with =M-x gptel-send=, bound to =C-c RET=.

3. Set chat parameters (LLM provider, model, directives etc) for the session by calling =gptel-send= with a prefix argument (=C-u C-c RET=):

[[https://user-images.githubusercontent.com/8607532/224946059-9b918810-ab8b-46a6-b917-549d50c908f2.png]]

That's it. You can go back and edit previous prompts and responses if you want.

The default mode is =markdown-mode= if available, else =text-mode=. You can set =gptel-default-mode= to =org-mode= if desired:

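#+begin_src emacs-lisp
;; Use Org mode for new dedicated chat buffers.
(setq gptel-default-mode 'org-mode)
#+end_src
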
**** Save and restore your chat sessions

Saving the file will save the state of the conversation as well. To resume the chat, open the file and turn on =gptel-mode= before editing the buffer.

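Interactively that is just =C-x C-f= followed by =M-x gptel-mode=; as a sketch (the file name is illustrative):

#+begin_src emacs-lisp
(find-file "~/chats/emacs-help.org") ;A previously saved gptel chat
(gptel-mode)                         ;Restore conversation state
#+end_src
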
** FAQ
*** I want the window to scroll automatically as the response is inserted

To be minimally annoying, GPTel does not move the cursor by default. Add the following to your configuration to enable auto-scrolling.

#+begin_src emacs-lisp
(add-hook 'gptel-post-stream-hook 'gptel-auto-scroll)
#+end_src

*** I want the cursor to move to the next prompt after the response is inserted

To be minimally annoying, GPTel does not move the cursor by default. Add the following to your configuration to move the cursor:

#+begin_src emacs-lisp
(add-hook 'gptel-post-response-functions 'gptel-end-of-response)
#+end_src

You can also call =gptel-end-of-response= as a command at any time.

*** I want to change the prefix before the prompt and response

Customize =gptel-prompt-prefix-alist= and =gptel-response-prefix-alist=. You can set a different pair for each major-mode, as in the sketch below.

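A sketch for Org-mode chat buffers (the prefix strings are illustrative):

#+begin_src emacs-lisp
;; Both alists map a major mode to the string inserted before
;; prompts and responses respectively.
(setf (alist-get 'org-mode gptel-prompt-prefix-alist) "@user\n")
(setf (alist-get 'org-mode gptel-response-prefix-alist) "@assistant\n")
#+end_src
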
*** Why another LLM client?

Other Emacs clients for LLMs prescribe the format of the interaction (a comint shell, org-babel blocks, etc). I wanted:

1. Something that is as free-form as possible: query the model using any text in any buffer, and redirect the response as required. Using a dedicated =gptel= buffer just adds some visual flair to the interaction.
2. Integration with org-mode, not using a walled-off org-babel block, but as regular text. This way the model can generate code blocks that I can run.

** Additional Configuration
:PROPERTIES:
:ID: f885adac-58a3-4eba-a6b7-91e9e7a17829
:END:

#+begin_src emacs-lisp :exports none :results list
(let ((all))
  (mapatoms (lambda (sym)
              (when (and (string-match-p "^gptel-[^-]" (symbol-name sym))
                         (get sym 'variable-documentation))
                (push sym all))))
  all)
#+end_src

|----------------------+---------------------------------------------------------------------|
| *Connection options* |                                                                     |
|----------------------+---------------------------------------------------------------------|
| =gptel-use-curl=     | Use Curl (default), fallback to Emacs' built-in =url=.              |
| =gptel-proxy=        | Proxy server for requests, passed to curl via =--proxy=.            |
| =gptel-api-key=      | Variable/function that returns the API key for the active backend.  |
|----------------------+---------------------------------------------------------------------|

|---------------------+----------------------------------------------------------|
| *LLM options*       | /(Note: not supported uniformly across LLMs)/            |
|---------------------+----------------------------------------------------------|
| =gptel-backend=     | Default LLM backend.                                     |
| =gptel-model=       | Default model to use, depends on the backend.            |
| =gptel-stream=      | Enable streaming responses, if the backend supports it.  |
| =gptel-directives=  | Alist of system directives, can switch on the fly.       |
| =gptel-max-tokens=  | Maximum token count (in query + response).               |
| =gptel-temperature= | Randomness in response text, 0 to 2.                     |
|---------------------+----------------------------------------------------------|

|-------------------------------+------------------------------------------------------------------|
| *Chat UI options*             |                                                                  |
|-------------------------------+------------------------------------------------------------------|
| =gptel-default-mode=          | Major mode for dedicated chat buffers.                           |
| =gptel-prompt-prefix-alist=   | Text inserted before queries.                                    |
| =gptel-response-prefix-alist= | Text inserted before responses.                                  |
| =gptel-use-header-line=       | Display status messages in header-line (default) or minibuffer.  |
|-------------------------------+------------------------------------------------------------------|

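For example, a few of these options set globally (the values are illustrative, not recommendations):

#+begin_src emacs-lisp
(setq-default gptel-max-tokens 1000  ;Upper bound on response length
              gptel-temperature 0.7) ;Response randomness, 0 to 2
#+end_src
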
** COMMENT Will you add feature X?

Maybe; I'd like to experiment a bit more first. Features added since the inception of this package include

- Curl support (=gptel-use-curl=)
- Streaming responses (=gptel-stream=)
- Cancelling requests in progress (=gptel-abort=)
- General API for writing your own commands (=gptel-request=, [[https://github.com/karthink/gptel/wiki][wiki]])
- Dispatch menus using Transient (=gptel-send= with a prefix arg)
- Specifying the conversation context size
- GPT-4 support
- Response redirection (to the echo area, another buffer, etc)
- A built-in refactor/rewrite prompt
- Limiting conversation context to Org headings using properties (#58)
- Saving and restoring chats (#17)
- Support for local LLMs.

Features being considered or in the pipeline:

- Fully stateless design (#17)

** The gptel API

GPTel's default usage pattern is simple, and will stay this way: Read input in any buffer and insert the response below it. Some custom behavior is possible with the transient menu (=C-u M-x gptel-send=).

For more programmable usage, gptel provides a general =gptel-request= function that accepts a custom prompt and a callback to act on the response. You can use this to build custom workflows not supported by =gptel-send=. See the documentation of =gptel-request=, and the [[https://github.com/karthink/gptel/wiki][wiki]] for examples.

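A minimal sketch of this pattern (the prompt text is illustrative; see the =gptel-request= docstring for the full set of keyword arguments):

#+begin_src emacs-lisp
(gptel-request
 "Summarize the Emacs philosophy in one sentence."
 :callback
 (lambda (response info)
   (if response
       (message "LLM response: %s" response)
     (message "gptel-request failed with message: %s"
              (plist-get info :status)))))
#+end_src
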
*** Extensions using GPTel

These are packages that depend on GPTel to provide additional functionality:

- [[https://github.com/kamushadenes/gptel-extensions.el][gptel-extensions]]: Extra utility functions for GPTel.
- [[https://github.com/kamushadenes/ai-blog.el][ai-blog.el]]: Streamline generation of blog posts in Hugo.

** Alternatives

Other Emacs clients for LLMs include

- [[https://github.com/xenodium/chatgpt-shell][chatgpt-shell]]: comint-shell based interaction with ChatGPT. Also supports DALL-E, executable code blocks in the responses, and more.
- [[https://github.com/rksm/org-ai][org-ai]]: Interaction through special =#+begin_ai ... #+end_ai= Org-mode blocks. Also supports DALL-E, querying ChatGPT with the contents of project files, and more.

There are several more: [[https://github.com/CarlQLange/chatgpt-arcana.el][chatgpt-arcana]], [[https://github.com/MichaelBurge/leafy-mode][leafy-mode]], [[https://github.com/iwahbe/chat.el][chat.el]]

** Breaking Changes

- =gptel-post-response-hook= has been renamed to =gptel-post-response-functions=, and functions in this hook are now called with two arguments: the start and end buffer positions of the response. This should make it easy to act on the response text without having to locate it first. (See the sketch after this list.)

- Possible breakage, see #120: If streaming responses stop working for you after upgrading to v0.5, try reinstalling gptel and deleting its native comp eln cache in =native-comp-eln-load-path=.

- The user option =gptel-host= is deprecated. If the defaults don't work for you, use =gptel-make-openai= (which see) to customize server settings.

- =gptel-api-key-from-auth-source= now searches for the API key using the host address for the active LLM backend, /i.e./ "api.openai.com" when using ChatGPT. You may need to update your =~/.authinfo=.
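
For the first change above, a sketch of a hook function that receives the response bounds (the function name and body are illustrative):

#+begin_src emacs-lisp
(defun my/gptel-announce-response (beg end)
  "Echo the size of the response inserted between BEG and END."
  (message "gptel: inserted a %d character response" (- end beg)))

(add-hook 'gptel-post-response-functions #'my/gptel-announce-response)
#+end_src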

** Acknowledgments

- [[https://github.com/algal][Alexis Gallagher]] and [[https://github.com/d1egoaz][Diego Alvarez]] for fixing a nasty multi-byte bug with =url-retrieve=.
- [[https://github.com/tarsius][Jonas Bernoulli]] for the Transient library.

# Local Variables:
# toc-org-max-depth: 4
# eval: (and (fboundp 'toc-org-mode) (toc-org-mode 1))
# End: