#+title: GPTel: A simple LLM client for Emacs

[[https://melpa.org/#/gptel][file:https://melpa.org/packages/gptel-badge.svg]]

GPTel is a simple Large Language Model chat client for Emacs, with support for multiple models and backends.

| LLM Backend     | Supports | Requires                  |
|-----------------+----------+---------------------------|
| ChatGPT         | ✓        | [[https://platform.openai.com/account/api-keys][API key]]                   |
| Azure           | ✓        | Deployment and API key    |
| Ollama          | ✓        | [[https://ollama.ai/][Ollama running locally]]    |
| GPT4All         | ✓        | [[https://gpt4all.io/index.html][GPT4All running locally]]   |
| Gemini          | ✓        | [[https://makersuite.google.com/app/apikey][API key]]                   |
| Llama.cpp       | ✓        | [[https://github.com/ggerganov/llama.cpp/tree/master/examples/server#quick-start][Llama.cpp running locally]] |
| Llamafile       | ✓        | [[https://github.com/Mozilla-Ocho/llamafile#quickstart][Local Llamafile server]]    |
| Kagi FastGPT    | ✓        | [[https://kagi.com/settings?p=api][API key]]                   |
| Kagi Summarizer | ✓        | [[https://kagi.com/settings?p=api][API key]]                   |
| together.ai     | ✓        | [[https://api.together.xyz/settings/api-keys][API key]]                   |
| Anyscale        | ✓        | [[https://docs.endpoints.anyscale.com/][API key]]                   |

*General usage*: ([[https://www.youtube.com/watch?v=bsRnh_brggM][YouTube Demo]])

https://user-images.githubusercontent.com/8607532/230516812-86510a09-a2fb-4cbd-b53f-cc2522d05a13.mp4

https://user-images.githubusercontent.com/8607532/230516816-ae4a613a-4d01-4073-ad3f-b66fa73c6e45.mp4

*Multi-LLM support demo*:

https://github-production-user-asset-6210df.s3.amazonaws.com/8607532/278854024-ae1336c4-5b87-41f2-83e9-e415349d6a43.mp4

- It's async and fast, and streams responses.
- Interact with LLMs from anywhere in Emacs (any buffer, shell, minibuffer, wherever).
- LLM responses are in Markdown or Org markup.
- Supports conversations and multiple independent sessions.
- Save chats as regular Markdown/Org/Text files and resume them later.
- You can go back and edit your previous prompts or LLM responses when continuing a conversation. These will be fed back to the model.
- Don't like gptel's workflow? Use it to create your own for any supported model/backend with a [[https://github.com/karthink/gptel/wiki#defining-custom-gptel-commands][simple API]].

GPTel uses Curl if available, but falls back to the built-in =url-retrieve= to work without external dependencies.
** Contents                                                            :toc:
  - [[#installation][Installation]]
    - [[#straight][Straight]]
    - [[#manual][Manual]]
    - [[#doom-emacs][Doom Emacs]]
    - [[#spacemacs][Spacemacs]]
  - [[#setup][Setup]]
    - [[#chatgpt][ChatGPT]]
    - [[#other-llm-backends][Other LLM backends]]
      - [[#azure][Azure]]
      - [[#gpt4all][GPT4All]]
      - [[#ollama][Ollama]]
      - [[#gemini][Gemini]]
      - [[#llamacpp-or-llamafile][Llama.cpp or Llamafile]]
      - [[#kagi-fastgpt--summarizer][Kagi (FastGPT & Summarizer)]]
      - [[#togetherai][together.ai]]
      - [[#anyscale][Anyscale]]
  - [[#usage][Usage]]
    - [[#in-any-buffer][In any buffer:]]
    - [[#in-a-dedicated-chat-buffer][In a dedicated chat buffer:]]
      - [[#save-and-restore-your-chat-sessions][Save and restore your chat sessions]]
  - [[#faq][FAQ]]
    - [[#i-want-the-window-to-scroll-automatically-as-the-response-is-inserted][I want the window to scroll automatically as the response is inserted]]
    - [[#i-want-the-cursor-to-move-to-the-next-prompt-after-the-response-is-inserted][I want the cursor to move to the next prompt after the response is inserted]]
    - [[#i-want-to-change-the-prefix-before-the-prompt-and-response][I want to change the prefix before the prompt and response]]
    - [[#doom-emacs-sending-a-query-from-the-gptel-menu-fails-because-of-a-key-conflict-with-org-mode][(Doom Emacs) Sending a query from the gptel menu fails because of a key conflict with Org mode]]
    - [[#why-another-llm-client][Why another LLM client?]]
  - [[#additional-configuration][Additional Configuration]]
  - [[#the-gptel-api][The gptel API]]
    - [[#extensions-using-gptel][Extensions using GPTel]]
  - [[#alternatives][Alternatives]]
  - [[#breaking-changes][Breaking Changes]]
  - [[#acknowledgments][Acknowledgments]]

** Installation

GPTel is on MELPA. Ensure that MELPA is in your list of package sources, then install gptel with =M-x package-install⏎= =gptel=.

(Optional: Install =markdown-mode=.)
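If you manage packages with =use-package= (built into Emacs 29+), the same install can be sketched in one declaration, assuming MELPA is already present in =package-archives=:

#+begin_src emacs-lisp
  ;; Minimal sketch: install gptel from MELPA via use-package.
  ;; Assumes MELPA is already in `package-archives'.
  (use-package gptel
    :ensure t)
#+end_src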
*** Straight
#+begin_src emacs-lisp
  (straight-use-package 'gptel)
#+end_src

Installing the =markdown-mode= package is optional.
*** Manual
Clone or download this repository and run =M-x package-install-file⏎= on the repository directory.

Installing the =markdown-mode= package is optional.
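The non-interactive equivalent is a single call; the checkout path below is hypothetical:

#+begin_src emacs-lisp
  ;; Sketch: install from a local checkout.
  ;; `package-install-file' also accepts a directory.
  (package-install-file "~/src/gptel/")
#+end_src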
*** Doom Emacs
In =packages.el=
#+begin_src emacs-lisp
  (package! gptel)
#+end_src

In =config.el=
#+begin_src emacs-lisp
  (use-package! gptel
    :config
    (setq! gptel-api-key "your key"))
#+end_src
*** Spacemacs
After installation with =M-x package-install⏎= =gptel=:

- Add =gptel= to =dotspacemacs-additional-packages=
- Add =(require 'gptel)= to =dotspacemacs/user-config=
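A minimal sketch of the two dotfile fragments, with the surrounding =~/.spacemacs= structure elided:

#+begin_src emacs-lisp
  ;; Inside dotspacemacs/layers:
  (setq-default dotspacemacs-additional-packages '(gptel))

  ;; Inside dotspacemacs/user-config:
  (require 'gptel)
#+end_src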
** Setup
*** ChatGPT
Procure an [[https://platform.openai.com/account/api-keys][OpenAI API key]].

Optional: Set =gptel-api-key= to the key. Alternatively, you may choose a more secure method, such as:

- Storing it in =~/.authinfo=. By default, "api.openai.com" is used as HOST and "apikey" as USER.
  #+begin_src authinfo
  machine api.openai.com login apikey password TOKEN
  #+end_src
- Setting it to a function that returns the key (as in the sketch below).
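Here is a minimal sketch of the function-valued option, assuming the =~/.authinfo= entry shown above; =auth-source-pick-first-password= is part of Emacs' built-in auth-source library.

#+begin_src emacs-lisp
  ;; Look the key up from auth-source instead of hard-coding it.
  ;; Assumes a ~/.authinfo entry for api.openai.com as shown above.
  (setq gptel-api-key
        (lambda ()
          (auth-source-pick-first-password :host "api.openai.com"
                                           :user "apikey")))
#+end_src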
*** Other LLM backends
**** Azure
Register a backend with
#+begin_src emacs-lisp
  (gptel-make-azure "Azure-1"     ;Name, whatever you'd like
    :protocol "https"             ;Optional -- https is the default
    :host "YOUR_RESOURCE_NAME.openai.azure.com"
    :endpoint "/openai/deployments/YOUR_DEPLOYMENT_NAME/chat/completions?api-version=2023-05-15" ;or equivalent
    :stream t                     ;Enable streaming responses
    :key #'gptel-api-key
    :models '("gpt-3.5-turbo" "gpt-4"))
#+end_src

Refer to the documentation of =gptel-make-azure= to set more parameters.

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]). If you want it to be the default, set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
  (setq-default gptel-backend (gptel-make-azure "Azure-1" ...)
                gptel-model "gpt-3.5-turbo")
#+end_src
**** GPT4All
Register a backend with
#+begin_src emacs-lisp
  (gptel-make-gpt4all "GPT4All"   ;Name of your choosing
    :protocol "http"
    :host "localhost:4891"        ;Where it's running
    :models '("mistral-7b-openorca.Q4_0.gguf")) ;Available models
#+end_src

These are the required parameters; refer to the documentation of =gptel-make-gpt4all= for more.

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=. Additionally, you may want to increase the response token size, since GPT4All uses very short (often truncated) responses by default:
#+begin_src emacs-lisp
  ;; OPTIONAL configuration
  (setq-default gptel-model "mistral-7b-openorca.Q4_0.gguf" ;Pick your default model
                gptel-backend (gptel-make-gpt4all "GPT4All" :protocol ...))
  (setq-default gptel-max-tokens 500)
#+end_src
**** Ollama
Register a backend with
#+begin_src emacs-lisp
  (gptel-make-ollama "Ollama"     ;Any name of your choosing
    :host "localhost:11434"       ;Where it's running
    :stream t                     ;Stream responses
    :models '("mistral:latest"))  ;List of models
#+end_src

These are the required parameters; refer to the documentation of =gptel-make-ollama= for more.

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
  ;; OPTIONAL configuration
  (setq-default gptel-model "mistral:latest" ;Pick your default model
                gptel-backend (gptel-make-ollama "Ollama" :host ...))
#+end_src
**** Gemini
Register a backend with
#+begin_src emacs-lisp
  ;; :key can be a function that returns the API key.
  (gptel-make-gemini "Gemini"
    :key "YOUR_GEMINI_API_KEY"
    :stream t)
#+end_src

These are the required parameters; refer to the documentation of =gptel-make-gemini= for more.

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
  ;; OPTIONAL configuration
  (setq-default gptel-model "gemini-pro" ;Pick your default model
                gptel-backend (gptel-make-gemini "Gemini" :key ...))
#+end_src
**** Llama.cpp or Llamafile
(If using a llamafile, run a [[https://github.com/Mozilla-Ocho/llamafile#other-example-llamafiles][server llamafile]] instead of a "command-line llamafile", and use a model that supports text generation.)

Register a backend with
#+begin_src emacs-lisp
  ;; Llama.cpp offers an OpenAI-compatible API
  (gptel-make-openai "llama-cpp"  ;Any name
    :stream t                     ;Stream responses
    :protocol "http"
    :host "localhost:8000"        ;Llama.cpp server location
    :models '("test"))            ;Any names, doesn't matter for Llama
#+end_src

These are the required parameters; refer to the documentation of =gptel-make-openai= for more.

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
  ;; OPTIONAL configuration
  (setq-default gptel-backend (gptel-make-openai "llama-cpp" ...)
                gptel-model "test")
#+end_src
**** Kagi (FastGPT & Summarizer)
Kagi's FastGPT model and the Universal Summarizer are both supported. A couple of notes:
1. Universal Summarizer: If there is a URL at point, the summarizer will summarize the contents of the URL. Otherwise the context sent to the model is the same as always: the buffer text up to point, or the contents of the region if the region is active.
2. Kagi models do not support multi-turn conversations; interactions are "one-shot". They also do not support streaming responses.

Register a backend with
#+begin_src emacs-lisp
  (gptel-make-kagi "Kagi"         ;any name
    :key "YOUR_KAGI_API_KEY")     ;can be a function that returns the key
#+end_src

These are the required parameters; refer to the documentation of =gptel-make-kagi= for more.

You can pick this backend and the model (fastgpt/summarizer) from the transient menu when using gptel. Alternatively you can set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
  ;; OPTIONAL configuration
  (setq-default gptel-model "fastgpt"
                gptel-backend (gptel-make-kagi "Kagi" :key ...))
#+end_src

The alternatives to =fastgpt= include =summarize:cecil=, =summarize:agnes=, =summarize:daphne= and =summarize:muriel=. The difference between the summarizer engines is [[https://help.kagi.com/kagi/api/summarizer.html#summarization-engines][documented here]].
**** together.ai
Register a backend with
#+begin_src emacs-lisp
  ;; Together.ai offers an OpenAI-compatible API
  (gptel-make-openai "TogetherAI" ;Any name you want
    :host "api.together.xyz"
    :key "your-api-key"           ;can be a function that returns the key
    :stream t
    :models '(;; has many more, check together.ai
              "mistralai/Mixtral-8x7B-Instruct-v0.1"
              "codellama/CodeLlama-13b-Instruct-hf"
              "codellama/CodeLlama-34b-Instruct-hf"))
#+end_src

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
  ;; OPTIONAL configuration
  (setq-default gptel-backend (gptel-make-openai "TogetherAI" ...)
                gptel-model "mistralai/Mixtral-8x7B-Instruct-v0.1")
#+end_src
**** Anyscale
Register a backend with
#+begin_src emacs-lisp
  ;; Anyscale offers an OpenAI-compatible API
  (gptel-make-openai "Anyscale"   ;Any name you want
    :host "api.endpoints.anyscale.com"
    :key "your-api-key"           ;can be a function that returns the key
    :models '(;; has many more, check anyscale
              "mistralai/Mixtral-8x7B-Instruct-v0.1"))
#+end_src

You can pick this backend from the menu when using gptel (see [[#usage][Usage]]), or set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
  ;; OPTIONAL configuration
  (setq-default gptel-backend (gptel-make-openai "Anyscale" ...)
                gptel-model "mistralai/Mixtral-8x7B-Instruct-v0.1")
#+end_src
** Usage

(This is also a [[https://www.youtube.com/watch?v=bsRnh_brggM][video demo]] showing various uses of gptel.)

|--------------------+--------------------------------------------------------------------------------------------|
| *Command*          | Description                                                                                  |
|--------------------+--------------------------------------------------------------------------------------------|
| =gptel-send=       | Send conversation up to =(point)=, or selection if region is active. Works anywhere in Emacs. |
| =gptel=            | Create a new dedicated chat buffer. Not required to use gptel.                               |
| =C-u= =gptel-send= | Transient menu for preferences, input/output redirection etc.                                |
| =gptel-menu=       | /(Same)/                                                                                     |
|--------------------+--------------------------------------------------------------------------------------------|
| =gptel-set-topic=  | /(Org-mode only)/ Limit conversation context to an Org heading                              |
|--------------------+--------------------------------------------------------------------------------------------|

*** In any buffer:

1. Call =M-x gptel-send= to send the text up to the cursor. The response will be inserted below. Continue the conversation by typing below the response.
2. If a region is selected, the conversation will be limited to its contents.
3. Call =M-x gptel-send= with a prefix argument to
   - set chat parameters (GPT model, directives etc) for this buffer,
   - read the prompt from elsewhere or redirect the response elsewhere,
   - or replace the prompt with the response.

[[https://user-images.githubusercontent.com/8607532/230770018-9ce87644-6c17-44af-bd39-8c899303dce1.png]]

With a region selected, you can also rewrite prose or refactor code from here:

*Code*:

[[https://user-images.githubusercontent.com/8607532/230770162-1a5a496c-ee57-4a67-9c95-d45f238544ae.png]]

*Prose*:

[[https://user-images.githubusercontent.com/8607532/230770352-ee6f45a3-a083-4cf0-b13c-619f7710e9ba.png]]

*** In a dedicated chat buffer:

1. Run =M-x gptel= to start or switch to the chat buffer. It will ask you for the key if you skipped the previous step. Run it with a prefix argument (=C-u M-x gptel=) to start a new session.
2. In the gptel buffer, send your prompt with =M-x gptel-send=, bound to =C-c RET=.
3. Set chat parameters (LLM provider, model, directives etc) for the session by calling =gptel-send= with a prefix argument (=C-u C-c RET=):

[[https://user-images.githubusercontent.com/8607532/224946059-9b918810-ab8b-46a6-b917-549d50c908f2.png]]

That's it. You can go back and edit previous prompts and responses if you want.

The default mode is =markdown-mode= if available, else =text-mode=. You can set =gptel-default-mode= to =org-mode= if desired (see the sketch at the end of this section).

**** Save and restore your chat sessions

Saving the file will save the state of the conversation as well. To resume the chat, open the file and turn on =gptel-mode= before editing the buffer.
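For example, a one-line configuration sketch to make dedicated chat buffers use Org mode, as mentioned above:

#+begin_src emacs-lisp
  ;; Use Org mode instead of markdown-mode/text-mode for chat buffers.
  (setq gptel-default-mode 'org-mode)
#+end_src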
** FAQ
*** I want the window to scroll automatically as the response is inserted

To be minimally annoying, GPTel does not move the cursor by default. Add the following to your configuration to enable auto-scrolling:

#+begin_src emacs-lisp
  (add-hook 'gptel-post-stream-hook 'gptel-auto-scroll)
#+end_src
*** I want the cursor to move to the next prompt after the response is inserted

To be minimally annoying, GPTel does not move the cursor by default. Add the following to your configuration to move the cursor:

#+begin_src emacs-lisp
  (add-hook 'gptel-post-response-functions 'gptel-end-of-response)
#+end_src

You can also call =gptel-end-of-response= as a command at any time.
*** I want to change the prefix before the prompt and response

Customize =gptel-prompt-prefix-alist= and =gptel-response-prefix-alist=. You can set a different pair for each major-mode.
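A sketch with illustrative prefix strings (any strings work; these values are only examples, not the defaults):

#+begin_src emacs-lisp
  ;; Illustrative values only -- pick any strings you like.
  (setq gptel-prompt-prefix-alist
        '((markdown-mode . "### Prompt:\n")
          (org-mode . "*Prompt*: ")
          (text-mode . "Prompt: ")))
  (setq gptel-response-prefix-alist
        '((markdown-mode . "### Response:\n")
          (org-mode . "*Response*: ")
          (text-mode . "Response: ")))
#+end_src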
*** (Doom Emacs) Sending a query from the gptel menu fails because of a key conflict with Org mode

Doom binds ~RET~ in Org mode to =+org/dwim-at-point=, which appears to conflict with gptel's transient menu bindings for some reason.

Two solutions:
- Press ~C-m~ instead of the return key.
- Change the send key from return to a key of your choice:
  #+begin_src emacs-lisp
    (transient-suffix-put 'gptel-menu (kbd "RET") :key "<f8>")
  #+end_src
*** Why another LLM client?

Other Emacs clients for LLMs prescribe the format of the interaction (a comint shell, org-babel blocks, etc). I wanted:

1. Something that is as free-form as possible: query the model using any text in any buffer, and redirect the response as required. Using a dedicated =gptel= buffer just adds some visual flair to the interaction.
2. Integration with org-mode, not using a walled-off org-babel block, but as regular text. This way the model can generate code blocks that I can run.
** Additional Configuration
:PROPERTIES:
:ID:       f885adac-58a3-4eba-a6b7-91e9e7a17829
:END:

#+begin_src emacs-lisp :exports none :results list
  (let ((all))
    (mapatoms (lambda (sym)
                (when (and (string-match-p "^gptel-[^-]" (symbol-name sym))
                           (get sym 'variable-documentation))
                  (push sym all))))
    all)
#+end_src

|----------------------+---------------------------------------------------------------------|
| *Connection options* |                                                                      |
|----------------------+---------------------------------------------------------------------|
| =gptel-use-curl=     | Use Curl (default), fallback to Emacs' built-in =url=.               |
| =gptel-proxy=        | Proxy server for requests, passed to curl via =--proxy=.             |
| =gptel-api-key=      | Variable/function that returns the API key for the active backend.   |
|----------------------+---------------------------------------------------------------------|

|---------------------+---------------------------------------------------------|
| *LLM options*       | /(Note: not supported uniformly across LLMs)/           |
|---------------------+---------------------------------------------------------|
| =gptel-backend=     | Default LLM Backend.                                    |
| =gptel-model=       | Default model to use, depends on the backend.           |
| =gptel-stream=      | Enable streaming responses, if the backend supports it. |
| =gptel-directives=  | Alist of system directives, can switch on the fly.      |
| =gptel-max-tokens=  | Maximum token count (in query + response).              |
| =gptel-temperature= | Randomness in response text, 0 to 2.                    |
|---------------------+---------------------------------------------------------|

|-------------------------------+----------------------------------------------------------------|
| *Chat UI options*             |                                                                 |
|-------------------------------+----------------------------------------------------------------|
| =gptel-default-mode=          | Major mode for dedicated chat buffers.                          |
| =gptel-prompt-prefix-alist=   | Text inserted before queries.                                   |
| =gptel-response-prefix-alist= | Text inserted before responses.                                 |
| =gptel-use-header-line=       | Display status messages in header-line (default) or minibuffer. |
|-------------------------------+----------------------------------------------------------------|

** COMMENT Will you add feature X?

Maybe, I'd like to experiment a bit more first. Features added since the inception of this package include
- Curl support (=gptel-use-curl=)
- Streaming responses (=gptel-stream=)
- Cancelling requests in progress (=gptel-abort=)
- General API for writing your own commands (=gptel-request=, [[https://github.com/karthink/gptel/wiki][wiki]])
- Dispatch menus using Transient (=gptel-send= with a prefix arg)
- Specifying the conversation context size
- GPT-4 support
- Response redirection (to the echo area, another buffer, etc)
- A built-in refactor/rewrite prompt
- Limiting conversation context to Org headings using properties (#58)
- Saving and restoring chats (#17)
- Support for local LLMs.

Features being considered or in the pipeline:
- Fully stateless design (#17)

** The gptel API

GPTel's default usage pattern is simple, and will stay this way: read input in any buffer and insert the response below it. Some custom behavior is possible with the transient menu (=C-u M-x gptel-send=).

For more programmable usage, gptel provides a general =gptel-request= function that accepts a custom prompt and a callback to act on the response. You can use this to build custom workflows not supported by =gptel-send=. See the documentation of =gptel-request=, and the [[https://github.com/karthink/gptel/wiki][wiki]] for examples.
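As an illustration, here is a sketch of a custom command built on =gptel-request=. The command name is hypothetical; the keyword arguments and the callback's =(RESPONSE INFO)= signature follow the =gptel-request= docstring:

#+begin_src emacs-lisp
  ;; Hypothetical example command: explain the active region in the
  ;; echo area, using the current default backend and model.
  (defun my/gptel-explain-region (beg end)
    "Ask the LLM to explain the region between BEG and END."
    (interactive "r")
    (gptel-request
     (buffer-substring-no-properties beg end)
     :system "Explain the following text in one short paragraph."
     :callback (lambda (response info)
                 (if response
                     (message "gptel: %s" response)
                   ;; On failure RESPONSE is nil and INFO holds the status.
                   (message "gptel request failed: %s"
                            (plist-get info :status))))))
#+end_src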
*** Extensions using GPTel

These are packages that depend on GPTel to provide additional functionality:

- [[https://github.com/kamushadenes/gptel-extensions.el][gptel-extensions]]: Extra utility functions for GPTel.
- [[https://github.com/kamushadenes/ai-blog.el][ai-blog.el]]: Streamline the generation of blog posts in Hugo.

** Alternatives

Other Emacs clients for LLMs include:

- [[https://github.com/xenodium/chatgpt-shell][chatgpt-shell]]: comint-shell based interaction with ChatGPT. Also supports DALL-E, executable code blocks in the responses, and more.
- [[https://github.com/rksm/org-ai][org-ai]]: Interaction through special =#+begin_ai ... #+end_ai= Org-mode blocks. Also supports DALL-E, querying ChatGPT with the contents of project files, and more.

There are several more: [[https://github.com/CarlQLange/chatgpt-arcana.el][chatgpt-arcana]], [[https://github.com/MichaelBurge/leafy-mode][leafy-mode]], [[https://github.com/iwahbe/chat.el][chat.el]].

** Breaking Changes

- =gptel-post-response-hook= has been renamed to =gptel-post-response-functions=, and functions in this hook are now called with two arguments: the start and end buffer positions of the response. This should make it easy to act on the response text without having to locate it first.
- Possible breakage, see #120: If streaming responses stop working for you after upgrading to v0.5, try reinstalling gptel and deleting its native-comp eln cache in =native-comp-eln-load-path=.
- The user option =gptel-host= is deprecated. If the defaults don't work for you, use =gptel-make-openai= (which see) to customize server settings.
- =gptel-api-key-from-auth-source= now searches for the API key using the host address of the active LLM backend, /i.e./ "api.openai.com" when using ChatGPT. You may need to update your =~/.authinfo=.

** Acknowledgments

- [[https://github.com/algal][Alexis Gallagher]] and [[https://github.com/d1egoaz][Diego Alvarez]] for fixing a nasty multi-byte bug with =url-retrieve=.
- [[https://github.com/tarsius][Jonas Bernoulli]] for the Transient library.

# Local Variables:
# toc-org-max-depth: 4
# eval: (and (fboundp 'toc-org-mode) (toc-org-mode 1))
# End: