#+title: GPTel: A simple LLM client for Emacs

[[https://melpa.org/#/gptel][file:https://melpa.org/packages/gptel-badge.svg]]

GPTel is a simple Large Language Model chat client for Emacs, with support for multiple models and backends.

| LLM Backend     | Supports | Requires                  |
|-----------------+----------+---------------------------|
| ChatGPT         | ✓        | [[https://platform.openai.com/account/api-keys][API key]]                   |
| Azure           | ✓        | Deployment and API key    |
| Ollama          | ✓        | [[https://ollama.ai/][Ollama running locally]]    |
| GPT4All         | ✓        | [[https://gpt4all.io/index.html][GPT4All running locally]]   |
| Gemini          | ✓        | [[https://makersuite.google.com/app/apikey][API key]]                   |
| Llama.cpp       | ✓        | [[https://github.com/ggerganov/llama.cpp/tree/master/examples/server#quick-start][Llama.cpp running locally]] |
| Llamafile       | ✓        | [[https://github.com/Mozilla-Ocho/llamafile#quickstart][Local Llamafile server]]    |
| Kagi FastGPT    | ✓        | [[https://kagi.com/settings?p=api][API key]]                   |
| Kagi Summarizer | ✓        | [[https://kagi.com/settings?p=api][API key]]                   |
| together.ai     | ✓        | [[https://api.together.xyz/settings/api-keys][API key]]                   |
| Anyscale        | ✓        | [[https://docs.endpoints.anyscale.com/][API key]]                   |

*General usage*: ([[https://www.youtube.com/watch?v=bsRnh_brggM][YouTube Demo]])

https://user-images.githubusercontent.com/8607532/230516812-86510a09-a2fb-4cbd-b53f-cc2522d05a13.mp4

https://user-images.githubusercontent.com/8607532/230516816-ae4a613a-4d01-4073-ad3f-b66fa73c6e45.mp4

*Multi-LLM support demo*:

https://github-production-user-asset-6210df.s3.amazonaws.com/8607532/278854024-ae1336c4-5b87-41f2-83e9-e415349d6a43.mp4

- It's async and fast, and streams responses.
- Interact with LLMs from anywhere in Emacs (any buffer, shell, minibuffer, wherever).
- LLM responses are in Markdown or Org markup.
- Supports conversations and multiple independent sessions.
- Save chats as regular Markdown/Org/Text files and resume them later.
- You can go back and edit your previous prompts or LLM responses when continuing a conversation. These will be fed back to the model.
- Don't like gptel's workflow? Use it to create your own for any supported model/backend with a [[https://github.com/karthink/gptel/wiki#defining-custom-gptel-commands][simple API]].

GPTel uses Curl if available, but falls back to the built-in =url-retrieve= to work without external dependencies.
** Contents                                                            :toc:
  - [[#installation][Installation]]
    - [[#straight][Straight]]
    - [[#manual][Manual]]
    - [[#doom-emacs][Doom Emacs]]
    - [[#spacemacs][Spacemacs]]
  - [[#setup][Setup]]
    - [[#chatgpt][ChatGPT]]
    - [[#other-llm-backends][Other LLM backends]]
      - [[#azure][Azure]]
      - [[#gpt4all][GPT4All]]
      - [[#ollama][Ollama]]
      - [[#gemini][Gemini]]
      - [[#llamacpp-or-llamafile][Llama.cpp or Llamafile]]
      - [[#kagi-fastgpt--summarizer][Kagi (FastGPT & Summarizer)]]
      - [[#togetherai][together.ai]]
      - [[#anyscale][Anyscale]]
  - [[#usage][Usage]]
    - [[#in-any-buffer][In any buffer:]]
    - [[#in-a-dedicated-chat-buffer][In a dedicated chat buffer:]]
      - [[#save-and-restore-your-chat-sessions][Save and restore your chat sessions]]
  - [[#faq][FAQ]]
    - [[#i-want-the-window-to-scroll-automatically-as-the-response-is-inserted][I want the window to scroll automatically as the response is inserted]]
    - [[#i-want-the-cursor-to-move-to-the-next-prompt-after-the-response-is-inserted][I want the cursor to move to the next prompt after the response is inserted]]
    - [[#i-want-to-change-the-prefix-before-the-prompt-and-response][I want to change the prefix before the prompt and response]]
    - [[#doom-emacs-sending-a-query-from-the-gptel-menu-fails-because-of-a-key-conflict-with-org-mode][(Doom Emacs) Sending a query from the gptel menu fails because of a key conflict with Org mode]]
    - [[#why-another-llm-client][Why another LLM client?]]
  - [[#additional-configuration][Additional Configuration]]
  - [[#the-gptel-api][The gptel API]]
    - [[#extensions-using-gptel][Extensions using GPTel]]
  - [[#alternatives][Alternatives]]
  - [[#breaking-changes][Breaking Changes]]
  - [[#acknowledgments][Acknowledgments]]

** Installation

GPTel is on MELPA. Ensure that MELPA is in your list of package sources, then install gptel with =M-x package-install⏎= =gptel=.

(Optional: Install =markdown-mode=.)
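If you manage packages with =use-package= (built into Emacs 29+), the same install can be sketched in one declaration, assuming MELPA is already present in =package-archives=:

#+begin_src emacs-lisp
  ;; Minimal sketch: install gptel from MELPA via use-package.
  ;; Assumes MELPA is already in `package-archives'.
  (use-package gptel
    :ensure t)
#+end_src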
*** Straight
#+begin_src emacs-lisp
  (straight-use-package 'gptel)
#+end_src

Installing the =markdown-mode= package is optional.
*** Manual
Clone or download this repository and run =M-x package-install-file⏎= on the repository directory.

Installing the =markdown-mode= package is optional.
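The non-interactive equivalent is a single call; the checkout path below is hypothetical:

#+begin_src emacs-lisp
  ;; Sketch: install from a local checkout.
  ;; `package-install-file' also accepts a directory.
  (package-install-file "~/src/gptel/")
#+end_src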
*** Doom Emacs
In =packages.el=
#+begin_src emacs-lisp
  (package! gptel)
#+end_src

In =config.el=
#+begin_src emacs-lisp
  (use-package! gptel
    :config
    (setq! gptel-api-key "your key"))
#+end_src
*** Spacemacs
After installation with =M-x package-install⏎= =gptel=:

- Add =gptel= to =dotspacemacs-additional-packages=
- Add =(require 'gptel)= to =dotspacemacs/user-config=
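A minimal sketch of the two dotfile fragments, with the surrounding =~/.spacemacs= structure elided:

#+begin_src emacs-lisp
  ;; Inside dotspacemacs/layers:
  (setq-default dotspacemacs-additional-packages '(gptel))

  ;; Inside dotspacemacs/user-config:
  (require 'gptel)
#+end_src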
** Setup
*** ChatGPT
Procure an [[https://platform.openai.com/account/api-keys][OpenAI API key]].

Optional: Set =gptel-api-key= to the key. Alternatively, you may choose a more secure method, such as:

- Storing it in =~/.authinfo=. By default, "api.openai.com" is used as HOST and "apikey" as USER.
  #+begin_src authinfo
  machine api.openai.com login apikey password TOKEN
  #+end_src
- Setting it to a function that returns the key (as in the sketch below).
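Here is a minimal sketch of the function-valued option, assuming the =~/.authinfo= entry shown above; =auth-source-pick-first-password= is part of Emacs' built-in auth-source library.

#+begin_src emacs-lisp
  ;; Look the key up from auth-source instead of hard-coding it.
  ;; Assumes a ~/.authinfo entry for api.openai.com as shown above.
  (setq gptel-api-key
        (lambda ()
          (auth-source-pick-first-password :host "api.openai.com"
                                           :user "apikey")))
#+end_src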
*** Other LLM backends
**** Azure
Register a backend with
#+begin_src emacs-lisp
  (gptel-make-azure "Azure-1"     ;Name, whatever you'd like
    :protocol "https"             ;Optional -- https is the default
    :host "YOUR_RESOURCE_NAME.openai.azure.com"
    :endpoint "/openai/deployments/YOUR_DEPLOYMENT_NAME/chat/completions?api-version=2023-05-15" ;or equivalent
    :stream t                     ;Enable streaming responses
    :key #'gptel-api-key
    :models '("gpt-3.5-turbo" "gpt-4"))
#+end_src

Refer to the documentation of =gptel-make-azure= to set more parameters.

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]). If you want it to be the default, set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
  (setq-default gptel-backend (gptel-make-azure "Azure-1" ...)
                gptel-model "gpt-3.5-turbo")
#+end_src
**** GPT4All
Register a backend with
#+begin_src emacs-lisp
  (gptel-make-gpt4all "GPT4All"   ;Name of your choosing
    :protocol "http"
    :host "localhost:4891"        ;Where it's running
    :models '("mistral-7b-openorca.Q4_0.gguf")) ;Available models
#+end_src

These are the required parameters; refer to the documentation of =gptel-make-gpt4all= for more.

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=. Additionally, you may want to increase the response token size, since GPT4All uses very short (often truncated) responses by default:
#+begin_src emacs-lisp
  ;; OPTIONAL configuration
  (setq-default gptel-model "mistral-7b-openorca.Q4_0.gguf" ;Pick your default model
                gptel-backend (gptel-make-gpt4all "GPT4All" :protocol ...))
  (setq-default gptel-max-tokens 500)
#+end_src
**** Ollama
Register a backend with
#+begin_src emacs-lisp
  (gptel-make-ollama "Ollama"     ;Any name of your choosing
    :host "localhost:11434"       ;Where it's running
    :stream t                     ;Stream responses
    :models '("mistral:latest"))  ;List of models
#+end_src

These are the required parameters; refer to the documentation of =gptel-make-ollama= for more.

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
  ;; OPTIONAL configuration
  (setq-default gptel-model "mistral:latest" ;Pick your default model
                gptel-backend (gptel-make-ollama "Ollama" :host ...))
#+end_src
**** Gemini
Register a backend with
#+begin_src emacs-lisp
  ;; :key can be a function that returns the API key.
  (gptel-make-gemini "Gemini"
    :key "YOUR_GEMINI_API_KEY"
    :stream t)
#+end_src

These are the required parameters; refer to the documentation of =gptel-make-gemini= for more.

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
  ;; OPTIONAL configuration
  (setq-default gptel-model "gemini-pro" ;Pick your default model
                gptel-backend (gptel-make-gemini "Gemini" :key ...))
#+end_src
**** Llama.cpp or Llamafile
(If using a llamafile, run a [[https://github.com/Mozilla-Ocho/llamafile#other-example-llamafiles][server llamafile]] instead of a "command-line llamafile", and use a model that supports text generation.)

Register a backend with
#+begin_src emacs-lisp
  ;; Llama.cpp offers an OpenAI-compatible API
  (gptel-make-openai "llama-cpp"  ;Any name
    :stream t                     ;Stream responses
    :protocol "http"
    :host "localhost:8000"        ;Llama.cpp server location
    :models '("test"))            ;Any names, doesn't matter for Llama
#+end_src

These are the required parameters; refer to the documentation of =gptel-make-openai= for more.

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
  ;; OPTIONAL configuration
  (setq-default gptel-backend (gptel-make-openai "llama-cpp" ...)
                gptel-model "test")
#+end_src
**** Kagi (FastGPT & Summarizer)
Kagi's FastGPT model and the Universal Summarizer are both supported. A couple of notes:
1. Universal Summarizer: If there is a URL at point, the summarizer will summarize the contents of the URL. Otherwise the context sent to the model is the same as always: the buffer text up to point, or the contents of the region if the region is active.
2. Kagi models do not support multi-turn conversations; interactions are "one-shot". They also do not support streaming responses.

Register a backend with
#+begin_src emacs-lisp
  (gptel-make-kagi "Kagi"         ;any name
    :key "YOUR_KAGI_API_KEY")     ;can be a function that returns the key
#+end_src

These are the required parameters; refer to the documentation of =gptel-make-kagi= for more.

You can pick this backend and the model (fastgpt/summarizer) from the transient menu when using gptel. Alternatively you can set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
  ;; OPTIONAL configuration
  (setq-default gptel-model "fastgpt"
                gptel-backend (gptel-make-kagi "Kagi" :key ...))
#+end_src

The alternatives to =fastgpt= include =summarize:cecil=, =summarize:agnes=, =summarize:daphne= and =summarize:muriel=. The difference between the summarizer engines is [[https://help.kagi.com/kagi/api/summarizer.html#summarization-engines][documented here]].
**** together.ai
Register a backend with
#+begin_src emacs-lisp
  ;; Together.ai offers an OpenAI-compatible API
  (gptel-make-openai "TogetherAI" ;Any name you want
    :host "api.together.xyz"
    :key "your-api-key"           ;can be a function that returns the key
    :stream t
    :models '(;; has many more, check together.ai
              "mistralai/Mixtral-8x7B-Instruct-v0.1"
              "codellama/CodeLlama-13b-Instruct-hf"
              "codellama/CodeLlama-34b-Instruct-hf"))
#+end_src

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
  ;; OPTIONAL configuration
  (setq-default gptel-backend (gptel-make-openai "TogetherAI" ...)
                gptel-model "mistralai/Mixtral-8x7B-Instruct-v0.1")
#+end_src
**** Anyscale
Register a backend with
#+begin_src emacs-lisp
  ;; Anyscale offers an OpenAI-compatible API
  (gptel-make-openai "Anyscale"   ;Any name you want
    :host "api.endpoints.anyscale.com"
    :key "your-api-key"           ;can be a function that returns the key
    :models '(;; has many more, check anyscale
              "mistralai/Mixtral-8x7B-Instruct-v0.1"))
#+end_src

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
  ;; OPTIONAL configuration
  (setq-default gptel-backend (gptel-make-openai "Anyscale" ...)
                gptel-model "mistralai/Mixtral-8x7B-Instruct-v0.1")
#+end_src
** Usage

(This is also a [[https://www.youtube.com/watch?v=bsRnh_brggM][video demo]] showing various uses of gptel.)

|--------------------+--------------------------------------------------------------------------------------------|
| *Command*          | Description                                                                                  |
|--------------------+--------------------------------------------------------------------------------------------|
| =gptel-send=       | Send conversation up to =(point)=, or selection if region is active. Works anywhere in Emacs. |
| =gptel=            | Create a new dedicated chat buffer. Not required to use gptel.                               |
| =C-u= =gptel-send= | Transient menu for preferences, input/output redirection etc.                                |
| =gptel-menu=       | /(Same)/                                                                                     |
|--------------------+--------------------------------------------------------------------------------------------|
| =gptel-set-topic=  | /(Org-mode only)/ Limit conversation context to an Org heading                              |
|--------------------+--------------------------------------------------------------------------------------------|

*** In any buffer:

1. Call =M-x gptel-send= to send the text up to the cursor. The response will be inserted below. Continue the conversation by typing below the response.
2. If a region is selected, the conversation will be limited to its contents.
3. Call =M-x gptel-send= with a prefix argument to
   - set chat parameters (GPT model, directives etc) for this buffer,
   - read the prompt from elsewhere or redirect the response elsewhere,
   - or replace the prompt with the response.

[[https://user-images.githubusercontent.com/8607532/230770018-9ce87644-6c17-44af-bd39-8c899303dce1.png]]

With a region selected, you can also rewrite prose or refactor code from here:

*Code*:

[[https://user-images.githubusercontent.com/8607532/230770162-1a5a496c-ee57-4a67-9c95-d45f238544ae.png]]

*Prose*:

[[https://user-images.githubusercontent.com/8607532/230770352-ee6f45a3-a083-4cf0-b13c-619f7710e9ba.png]]

*** In a dedicated chat buffer:

1. Run =M-x gptel= to start or switch to the chat buffer. It will ask you for the key if you skipped the previous step. Run it with a prefix argument (=C-u M-x gptel=) to start a new session.
2. In the gptel buffer, send your prompt with =M-x gptel-send=, bound to =C-c RET=.
3. Set chat parameters (LLM provider, model, directives etc) for the session by calling =gptel-send= with a prefix argument (=C-u C-c RET=):

[[https://user-images.githubusercontent.com/8607532/224946059-9b918810-ab8b-46a6-b917-549d50c908f2.png]]

That's it. You can go back and edit previous prompts and responses if you want.

The default mode is =markdown-mode= if available, else =text-mode=. You can set =gptel-default-mode= to =org-mode= if desired (see the sketch at the end of this section).

**** Save and restore your chat sessions

Saving the file will save the state of the conversation as well. To resume the chat, open the file and turn on =gptel-mode= before editing the buffer.
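For example, a one-line configuration sketch to make dedicated chat buffers use Org mode, as mentioned above:

#+begin_src emacs-lisp
  ;; Use Org mode instead of markdown-mode/text-mode for chat buffers.
  (setq gptel-default-mode 'org-mode)
#+end_src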
** FAQ
*** I want the window to scroll automatically as the response is inserted

To be minimally annoying, GPTel does not move the cursor by default. Add the following to your configuration to enable auto-scrolling:

#+begin_src emacs-lisp
  (add-hook 'gptel-post-stream-hook 'gptel-auto-scroll)
#+end_src
*** I want the cursor to move to the next prompt after the response is inserted

To be minimally annoying, GPTel does not move the cursor by default. Add the following to your configuration to move the cursor:

#+begin_src emacs-lisp
  (add-hook 'gptel-post-response-functions 'gptel-end-of-response)
#+end_src

You can also call =gptel-end-of-response= as a command at any time.
*** I want to change the prefix before the prompt and response

Customize =gptel-prompt-prefix-alist= and =gptel-response-prefix-alist=. You can set a different pair for each major-mode.
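A sketch with illustrative prefix strings (any strings work; these values are only examples, not the defaults):

#+begin_src emacs-lisp
  ;; Illustrative values only -- pick any strings you like.
  (setq gptel-prompt-prefix-alist
        '((markdown-mode . "### Prompt:\n")
          (org-mode . "*Prompt*: ")
          (text-mode . "Prompt: ")))
  (setq gptel-response-prefix-alist
        '((markdown-mode . "### Response:\n")
          (org-mode . "*Response*: ")
          (text-mode . "Response: ")))
#+end_src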
*** (Doom Emacs) Sending a query from the gptel menu fails because of a key conflict with Org mode

Doom binds ~RET~ in Org mode to =+org/dwim-at-point=, which appears to conflict with gptel's transient menu bindings for some reason.

Two solutions:
- Press ~C-m~ instead of the return key.
- Change the send key from return to a key of your choice:
  #+begin_src emacs-lisp
    (transient-suffix-put 'gptel-menu (kbd "RET") :key "<f8>")
  #+end_src
*** Why another LLM client?

Other Emacs clients for LLMs prescribe the format of the interaction (a comint shell, org-babel blocks, etc). I wanted:

1. Something that is as free-form as possible: query the model using any text in any buffer, and redirect the response as required. Using a dedicated =gptel= buffer just adds some visual flair to the interaction.
2. Integration with org-mode, not using a walled-off org-babel block, but as regular text. This way the model can generate code blocks that I can run.
** Additional Configuration
:PROPERTIES:
:ID:       f885adac-58a3-4eba-a6b7-91e9e7a17829
:END:

#+begin_src emacs-lisp :exports none :results list
  (let ((all))
    (mapatoms (lambda (sym)
                (when (and (string-match-p "^gptel-[^-]" (symbol-name sym))
                           (get sym 'variable-documentation))
                  (push sym all))))
    all)
#+end_src

|----------------------+---------------------------------------------------------------------|
| *Connection options* |                                                                      |
|----------------------+---------------------------------------------------------------------|
| =gptel-use-curl=     | Use Curl (default), fallback to Emacs' built-in =url=.               |
| =gptel-proxy=        | Proxy server for requests, passed to curl via =--proxy=.             |
| =gptel-api-key=      | Variable/function that returns the API key for the active backend.   |
|----------------------+---------------------------------------------------------------------|

|---------------------+---------------------------------------------------------|
| *LLM options*       | /(Note: not supported uniformly across LLMs)/           |
|---------------------+---------------------------------------------------------|
| =gptel-backend=     | Default LLM Backend.                                    |
| =gptel-model=       | Default model to use, depends on the backend.           |
| =gptel-stream=      | Enable streaming responses, if the backend supports it. |
| =gptel-directives=  | Alist of system directives, can switch on the fly.      |
| =gptel-max-tokens=  | Maximum token count (in query + response).              |
| =gptel-temperature= | Randomness in response text, 0 to 2.                    |
|---------------------+---------------------------------------------------------|

|-------------------------------+----------------------------------------------------------------|
| *Chat UI options*             |                                                                 |
|-------------------------------+----------------------------------------------------------------|
| =gptel-default-mode=          | Major mode for dedicated chat buffers.                          |
| =gptel-prompt-prefix-alist=   | Text inserted before queries.                                   |
| =gptel-response-prefix-alist= | Text inserted before responses.                                 |
| =gptel-use-header-line=       | Display status messages in header-line (default) or minibuffer. |
|-------------------------------+----------------------------------------------------------------|

** COMMENT Will you add feature X?

Maybe, I'd like to experiment a bit more first. Features added since the inception of this package include
- Curl support (=gptel-use-curl=)
- Streaming responses (=gptel-stream=)
- Cancelling requests in progress (=gptel-abort=)
- General API for writing your own commands (=gptel-request=, [[https://github.com/karthink/gptel/wiki][wiki]])
- Dispatch menus using Transient (=gptel-send= with a prefix arg)
- Specifying the conversation context size
- GPT-4 support
- Response redirection (to the echo area, another buffer, etc)
- A built-in refactor/rewrite prompt
- Limiting conversation context to Org headings using properties (#58)
- Saving and restoring chats (#17)
- Support for local LLMs.

Features being considered or in the pipeline:
- Fully stateless design (#17)

** The gptel API

GPTel's default usage pattern is simple, and will stay this way: read input in any buffer and insert the response below it. Some custom behavior is possible with the transient menu (=C-u M-x gptel-send=).

For more programmable usage, gptel provides a general =gptel-request= function that accepts a custom prompt and a callback to act on the response. You can use this to build custom workflows not supported by =gptel-send=. See the documentation of =gptel-request=, and the [[https://github.com/karthink/gptel/wiki][wiki]] for examples.
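As an illustration, here is a sketch of a custom command built on =gptel-request=. The command name is hypothetical; the keyword arguments and the callback's =(RESPONSE INFO)= signature follow the =gptel-request= docstring:

#+begin_src emacs-lisp
  ;; Hypothetical example command: explain the active region in the
  ;; echo area, using the current default backend and model.
  (defun my/gptel-explain-region (beg end)
    "Ask the LLM to explain the region between BEG and END."
    (interactive "r")
    (gptel-request
     (buffer-substring-no-properties beg end)
     :system "Explain the following text in one short paragraph."
     :callback (lambda (response info)
                 (if response
                     (message "gptel: %s" response)
                   ;; On failure RESPONSE is nil and INFO holds the status.
                   (message "gptel request failed: %s"
                            (plist-get info :status))))))
#+end_src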
*** Extensions using GPTel

These are packages that depend on GPTel to provide additional functionality:

- [[https://github.com/kamushadenes/gptel-extensions.el][gptel-extensions]]: Extra utility functions for GPTel.
- [[https://github.com/kamushadenes/ai-blog.el][ai-blog.el]]: Streamline the generation of blog posts in Hugo.

** Alternatives

Other Emacs clients for LLMs include:

- [[https://github.com/xenodium/chatgpt-shell][chatgpt-shell]]: comint-shell based interaction with ChatGPT. Also supports DALL-E, executable code blocks in the responses, and more.
- [[https://github.com/rksm/org-ai][org-ai]]: Interaction through special =#+begin_ai ... #+end_ai= Org-mode blocks. Also supports DALL-E, querying ChatGPT with the contents of project files, and more.

There are several more: [[https://github.com/CarlQLange/chatgpt-arcana.el][chatgpt-arcana]], [[https://github.com/MichaelBurge/leafy-mode][leafy-mode]], [[https://github.com/iwahbe/chat.el][chat.el]].

** Breaking Changes

- =gptel-post-response-hook= has been renamed to =gptel-post-response-functions=, and functions in this hook are now called with two arguments: the start and end buffer positions of the response. This should make it easy to act on the response text without having to locate it first.
- Possible breakage, see #120: If streaming responses stop working for you after upgrading to v0.5, try reinstalling gptel and deleting its native-comp eln cache in =native-comp-eln-load-path=.
- The user option =gptel-host= is deprecated. If the defaults don't work for you, use =gptel-make-openai= (which see) to customize server settings.
- =gptel-api-key-from-auth-source= now searches for the API key using the host address of the active LLM backend, /i.e./ "api.openai.com" when using ChatGPT. You may need to update your =~/.authinfo=.

** Acknowledgments

- [[https://github.com/algal][Alexis Gallagher]] and [[https://github.com/d1egoaz][Diego Alvarez]] for fixing a nasty multi-byte bug with =url-retrieve=.
- [[https://github.com/tarsius][Jonas Bernoulli]] for the Transient library.

# Local Variables:
# toc-org-max-depth: 4
# eval: (and (fboundp 'toc-org-mode) (toc-org-mode 1))
# End: