gptel: Add multi-llm support

README.org: Update README with new information and a multi-llm demo.

gptel.el (gptel-host, gptel--known-backends, gptel--api-key,
gptel--create-prompt, gptel--request-data, gptel--parse-buffer, gptel-request,
gptel--parse-response, gptel--openai, gptel--debug, gptel--restore-state,
gptel, gptel-backend):

Integrate multiple LLMs through the introduction of gptel-backends. Each
backend is composed of two pieces:

1. An instance of a cl-struct, containing connection, authentication and model
information.  See the cl-struct `gptel-backend` for details.  A separate
cl-struct type is defined for each supported backend (OpenAI, Azure, GPT4All and
Ollama) that inherits from the generic gptel-backend type.

2. cl-generic implementations of specific tasks, like gathering up and
formatting context (previous user queries and LLM responses), parsing responses
or response streams, etc.  The four tasks currently specialized this way are
carried out by `gptel--parse-buffer` and `gptel--request-data` (for constructing
the query) and `gptel--parse-response` and `gptel-curl--parse-stream` (for
parsing the response).  See their implementations for details.  Some effort has
been made to limit the number of times dispatching is done when reading
streaming responses.

When a backend is created, it is registered in the collection
`gptel--known-backends` and can be accessed by name later, such as from the
transient menu.
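
For illustration, a hedged sketch of the pattern; `gptel-example` and its
method body below are hypothetical, not part of this commit:

    (cl-defstruct (gptel-example (:constructor gptel--make-example)
                                 (:include gptel-backend)))

    ;; Specialize one of the generic tasks on the new backend type:
    (cl-defmethod gptel--request-data ((_backend gptel-example) prompts)
      `(:model ,gptel-model :messages [,@prompts]))

    ;; Register an instance by name for later lookup:
    (setf (alist-get "Example" gptel--known-backends nil nil #'equal)
          (gptel--make-example :name "Example" :host "localhost:8080"))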

Only one of these backends is active at any time in a buffer, stored in the
buffer-local variable `gptel-backend`.  Most messaging, authentication and so
on takes the active backend into account, though there may be some leftovers.

When using `gptel-request` or `gptel-send`, the active backend can be changed or
let-bound.
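
A minimal sketch; the backend name and model here are assumptions, not
defaults shipped with gptel:

    ;; Use a previously registered backend for a single request:
    (let ((gptel-backend (alist-get "Ollama" gptel--known-backends
                                    nil nil #'equal))
          (gptel-model "mistral:latest"))
      (gptel-request "Explain this error message."))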

- Obsolete `gptel-host`
- Fix the rear-sticky property when restoring sessions from files.
- Document some variables (not user options), like `gptel--debug`

gptel-openai.el (gptel-backend, gptel-make-openai, gptel-make-azure,
gptel-make-gpt4all): This file (currently always loaded) sets up the generic
backend struct and includes constructors for creating OpenAI, GPT4All and Azure
backends.  They all use the same API so a single set of defgeneric
implementations suffices for all of them.
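
For instance, the built-in ChatGPT backend is created with this constructor; a
sketch mirroring that definition (model list abridged):

    (gptel-make-openai
     "ChatGPT"
     :header (lambda () `(("Authorization" .
                           ,(concat "Bearer " (gptel--get-api-key)))))
     :key #'gptel--get-api-key
     :stream t
     :models '("gpt-3.5-turbo" "gpt-4"))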

gptel-ollama.el (gptel-make-ollama): This file includes the cl-struct,
constructor and requisite defgeneric implementations for Ollama support.

gptel-transient.el (gptel-menu, gptel-provider-variable, gptel--infix-provider,
gptel-suffix-send):

- Provide access to all available LLM backends and models from `gptel-menu`.
- Adjust keybindings in gptel-menu: setting the model and query parameters is
  now bound to two char keybinds, while redirecting input and output is bound to
  single keys.
Karthik Chikmagalur 2023-10-23 23:40:52 -07:00
parent 61c0df5e19
commit 6419e8f021
6 changed files with 717 additions and 165 deletions

README.org

@@ -1,20 +1,34 @@
#+title: GPTel: A simple ChatGPT client for Emacs
#+title: GPTel: A simple LLM client for Emacs
[[https://melpa.org/#/gptel][file:https://melpa.org/packages/gptel-badge.svg]]
GPTel is a simple, no-frills ChatGPT client for Emacs.
GPTel is a simple Large Language Model chat client for Emacs, with support for multiple models/backends.
| LLM Backend | Supports | Requires |
|-------------+----------+------------------------|
| ChatGPT | ✓ | [[https://platform.openai.com/account/api-keys][API key]] |
| Azure | ✓ | Deployment and API key |
| Ollama | ✓ | An LLM running locally |
| GPT4All | ✓ | An LLM running locally |
| PrivateGPT | Planned | - |
| Llama.cpp | Planned | - |
*General usage*:
https://user-images.githubusercontent.com/8607532/230516812-86510a09-a2fb-4cbd-b53f-cc2522d05a13.mp4
https://user-images.githubusercontent.com/8607532/230516816-ae4a613a-4d01-4073-ad3f-b66fa73c6e45.mp4
- Requires an [[https://platform.openai.com/account/api-keys][OpenAI API key]].
*Multi-LLM support demo*:
https://github-production-user-asset-6210df.s3.amazonaws.com/8607532/278854024-ae1336c4-5b87-41f2-83e9-e415349d6a43.mp4
- It's async and fast, streams responses.
- Interact with ChatGPT from anywhere in Emacs (any buffer, shell, minibuffer, wherever)
- ChatGPT's responses are in Markdown or Org markup.
- Interact with LLMs from anywhere in Emacs (any buffer, shell, minibuffer, wherever)
- LLM responses are in Markdown or Org markup.
- Supports conversations and multiple independent sessions.
- Save chats as regular Markdown/Org/Text files and resume them later.
- You can go back and edit your previous prompts, or even ChatGPT's previous responses when continuing a conversation. These will be fed back to ChatGPT.
- You can go back and edit your previous prompts or LLM responses when continuing a conversation. These will be fed back to the model.
GPTel uses Curl if available, but falls back to url-retrieve to work without external dependencies.
@@ -25,13 +39,20 @@ GPTel uses Curl if available, but falls back to url-retrieve to work without ext
- [[#manual][Manual]]
- [[#doom-emacs][Doom Emacs]]
- [[#spacemacs][Spacemacs]]
- [[#setup][Setup]]
- [[#chatgpt][ChatGPT]]
- [[#other-llm-backends][Other LLM backends]]
- [[#azure][Azure]]
- [[#gpt4all][GPT4All]]
- [[#ollama][Ollama]]
- [[#usage][Usage]]
- [[#in-any-buffer][In any buffer:]]
- [[#in-a-dedicated-chat-buffer][In a dedicated chat buffer:]]
- [[#save-and-restore-your-chat-sessions][Save and restore your chat sessions]]
- [[#using-it-your-way][Using it your way]]
- [[#extensions-using-gptel][Extensions using GPTel]]
- [[#additional-configuration][Additional Configuration]]
- [[#why-another-chatgpt-client][Why another ChatGPT client?]]
- [[#why-another-llm-client][Why another LLM client?]]
- [[#will-you-add-feature-x][Will you add feature X?]]
- [[#alternatives][Alternatives]]
- [[#acknowledgments][Acknowledgments]]
@@ -41,7 +62,7 @@ GPTel uses Curl if available, but falls back to url-retrieve to work without ext
** Installation
GPTel is on MELPA. Install it with =M-x package-install⏎= =gptel=.
GPTel is on MELPA. Ensure that MELPA is in your list of sources, then install gptel with =M-x package-install⏎= =gptel=.
(Optional: Install =markdown-mode=.)
@@ -84,9 +105,8 @@ After installation with =M-x package-install⏎= =gptel=
- Add =gptel= to =dotspacemacs-additional-packages=
- Add =(require 'gptel)= to =dotspacemacs/user-config=
#+html: </details>
** Usage
** Setup
*** ChatGPT
Procure an [[https://platform.openai.com/account/api-keys][OpenAI API key]].
Optional: Set =gptel-api-key= to the key. Alternatively, you may choose a more secure method such as:
@@ -97,6 +117,72 @@ machine api.openai.com login apikey password TOKEN
#+end_src
- Setting it to a function that returns the key.
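For example, a minimal sketch of the last option, reusing the auth-source
lookup that gptel provides:
#+begin_src emacs-lisp
;; Look up the key in your auth-source (e.g. ~/.authinfo) at request time
(setq gptel-api-key #'gptel-api-key-from-auth-source)
#+end_src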
*** Other LLM backends
#+html: <details><summary>
**** Azure
#+html: </summary>
Register a backend with
#+begin_src emacs-lisp
(gptel-make-azure
"Azure-1" ;Name, whatever you'd like
:protocol "https" ;optional -- https is the default
:host "YOUR_RESOURCE_NAME.openai.azure.com"
:endpoint "/openai/deployments/YOUR_DEPLOYMENT_NAME/completions?api-version=2023-05-15" ;or equivalent
:stream t ;Enable streaming responses
:models '("gpt-3.5-turbo" "gpt-4"))
#+end_src
Refer to the documentation of =gptel-make-azure= to set more parameters.
You can pick this backend from the transient menu when using gptel (see Usage).
If you want it to be the default, set it as the default value of =gptel-backend=:
#+begin_src emacs-lisp
(setq-default gptel-backend
(gptel-make-azure
"Azure-1"
...))
#+end_src
#+html: </details>
#+html: <details><summary>
**** GPT4All
#+html: </summary>
Register a backend with
#+begin_src emacs-lisp
(gptel-make-gpt4all
"GPT4All" ;Name of your choosing
:protocol "http"
:host "localhost:4891" ;Where it's running
:models '("mistral-7b-openorca.Q4_0.gguf")) ;Available models
#+end_src
These are the required parameters, refer to the documentation of =gptel-make-gpt4all= for more.
You can pick this backend from the transient menu when using gptel (see usage), or set this as the default value of =gptel-backend=.
#+html: </details>
#+html: <details><summary>
**** Ollama
#+html: </summary>
Register a backend with
#+begin_src emacs-lisp
(defvar gptel--ollama
(gptel-make-ollama
"Ollama" ;Any name of your choosing
:host "localhost:11434" ;Where it's running
:models '("mistral:latest") ;Installed models
:stream t)) ;Stream responses
#+end_src
These are the required parameters, refer to the documentation of =gptel-make-ollama= for more.
You can pick this backend from the transient menu when using gptel (see usage), or set this as the default value of =gptel-backend=.
#+html: </details>
** Usage
*** In any buffer:
1. Select a region of text and call =M-x gptel-send=. The response will be inserted below your region.
@@ -122,11 +208,11 @@ With a region selected, you can also rewrite prose or refactor code from here:
*** In a dedicated chat buffer:
1. Run =M-x gptel= to start or switch to the ChatGPT buffer. It will ask you for the key if you skipped the previous step. Run it with a prefix-arg (=C-u M-x gptel=) to start a new session.
1. Run =M-x gptel= to start or switch to the chat buffer. It will ask you for the key if you skipped the previous step. Run it with a prefix-arg (=C-u M-x gptel=) to start a new session.
2. In the gptel buffer, send your prompt with =M-x gptel-send=, bound to =C-c RET=.
3. Set chat parameters (GPT model, directives etc) for the session by calling =gptel-send= with a prefix argument (=C-u C-c RET=):
3. Set chat parameters (LLM provider, model, directives etc) for the session by calling =gptel-send= with a prefix argument (=C-u C-c RET=):
[[https://user-images.githubusercontent.com/8607532/224946059-9b918810-ab8b-46a6-b917-549d50c908f2.png]]
@@ -157,17 +243,29 @@ These are packages that depend on GPTel to provide additional functionality
- [[https://github.com/kamushadenes/ai-blog.el][ai-blog.el]]: Streamline generation of blog posts in Hugo.
** Additional Configuration
:PROPERTIES:
:ID: f885adac-58a3-4eba-a6b7-91e9e7a17829
:END:
- =gptel-host=: Overrides the OpenAI API host. This is useful for those who transform Azure API into OpenAI API format, utilize reverse proxy, or employ third-party proxy services for the OpenAI API.
#+begin_src emacs-lisp :exports none
(let ((all))
(mapatoms (lambda (sym)
(when (and (string-match-p "^gptel-[^-]" (symbol-name sym))
(get sym 'variable-documentation))
(push sym all))))
all)
#+end_src
- =gptel-stream=: Stream responses (if the model supports streaming). Defaults to true.
- =gptel-proxy=: Path to a proxy to use for GPTel interactions. This is passed to Curl via the =--proxy= argument.
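For example, a hedged sketch (the proxy address is an assumption):
#+begin_src emacs-lisp
(setq gptel-proxy "socks5://localhost:1080")
#+end_src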
** Why another ChatGPT client?
** Why another LLM client?
Other Emacs clients for ChatGPT prescribe the format of the interaction (a comint shell, org-babel blocks, etc). I wanted:
Other Emacs clients for LLMs prescribe the format of the interaction (a comint shell, org-babel blocks, etc). I wanted:
1. Something that is as free-form as possible: query ChatGPT using any text in any buffer, and redirect the response as required. Using a dedicated =gptel= buffer just adds some visual flair to the interaction.
2. Integration with org-mode, not using a walled-off org-babel block, but as regular text. This way ChatGPT can generate code blocks that I can run.
1. Something that is as free-form as possible: query the model using any text in any buffer, and redirect the response as required. Using a dedicated =gptel= buffer just adds some visual flair to the interaction.
2. Integration with org-mode, not using a walled-off org-babel block, but as regular text. This way the model can generate code blocks that I can run.
** Will you add feature X?
@@ -183,13 +281,14 @@ Maybe, I'd like to experiment a bit more first. Features added since the incept
- A built-in refactor/rewrite prompt
- Limiting conversation context to Org headings using properties (#58)
- Saving and restoring chats (#17)
- Support for local LLMs.
Features being considered or in the pipeline:
- Fully stateless design (#17)
** Alternatives
Other Emacs clients for ChatGPT include
Other Emacs clients for LLMs include
- [[https://github.com/xenodium/chatgpt-shell][chatgpt-shell]]: comint-shell based interaction with ChatGPT. Also supports DALL-E, executable code blocks in the responses, and more.
- [[https://github.com/rksm/org-ai][org-ai]]: Interaction through special =#+begin_ai ... #+end_ai= Org-mode blocks. Also supports DALL-E, querying ChatGPT with the contents of project files, and more.

gptel-curl.el

@@ -41,14 +41,16 @@
"Produce list of arguments for calling Curl.
PROMPTS is the data to send, TOKEN is a unique identifier."
(let* ((url (format "%s://%s/v1/chat/completions"
gptel-protocol gptel-host))
(let* ((url (gptel-backend-url gptel-backend))
(data (encode-coding-string
(json-encode (gptel--request-data prompts))
(json-encode (gptel--request-data gptel-backend prompts))
'utf-8))
(headers
`(("Content-Type" . "application/json")
("Authorization" . ,(concat "Bearer " (gptel--api-key))))))
(append '(("Content-Type" . "application/json"))
(when-let ((backend-header (gptel-backend-header gptel-backend)))
(if (functionp backend-header)
(funcall backend-header)
backend-header)))))
(append
(list "--location" "--silent" "--compressed" "--disable"
(format "-X%s" "POST")
@@ -81,14 +83,33 @@ the response is inserted into the current buffer after point.
(random) (emacs-pid) (user-full-name)
(recent-keys))))
(args (gptel-curl--get-args (plist-get info :prompt) token))
(stream (and gptel-stream (gptel-backend-stream gptel-backend)))
(process (apply #'start-process "gptel-curl"
(generate-new-buffer "*gptel-curl*") "curl" args)))
(when gptel--debug
(message "%S" args))
(with-current-buffer (process-buffer process)
(set-process-query-on-exit-flag process nil)
(setf (alist-get process gptel-curl--process-alist)
(nconc (list :token token
;; FIXME `aref' breaks `cl-struct' abstraction boundary
;; FIXME `cl--generic-method' is an internal `cl-struct'
:parser (cl--generic-method-function
(if stream
(cl-find-method
'gptel-curl--parse-stream nil
(list
(aref (buffer-local-value
'gptel-backend (plist-get info :buffer))
0) t))
(cl-find-method
'gptel--parse-response nil
(list
(aref (buffer-local-value
'gptel-backend (plist-get info :buffer))
0) t t))))
:callback (or callback
(if gptel-stream
(if stream
#'gptel-curl--stream-insert-response
#'gptel--insert-response))
:transformer (when (eq (buffer-local-value
@@ -97,7 +118,7 @@ the response is inserted into the current buffer after point.
'org-mode)
(gptel--stream-convert-markdown->org)))
info))
(if gptel-stream
(if stream
(progn (set-process-sentinel process #'gptel-curl--stream-cleanup)
(set-process-filter process #'gptel-curl--stream-filter))
(set-process-sentinel process #'gptel-curl--sentinel)))))
@@ -252,22 +273,21 @@ See `gptel--url-get-response' for details.
(when (equal http-status "200")
(funcall (or (plist-get proc-info :callback)
#'gptel-curl--stream-insert-response)
(let* ((json-object-type 'plist)
(content-strs))
(condition-case nil
(while (re-search-forward "^data:" nil t)
(save-match-data
(unless (looking-at " *\\[DONE\\]")
(when-let* ((response (json-read))
(delta (map-nested-elt
response '(:choices 0 :delta)))
(content (plist-get delta :content)))
(push content content-strs)))))
(error
(goto-char (match-beginning 0))))
(apply #'concat (nreverse content-strs)))
(funcall (plist-get proc-info :parser) nil proc-info)
proc-info))))))
(cl-defgeneric gptel-curl--parse-stream (backend proc-info)
"Stream parser for gptel-curl.
Implementations of this function run as part of the process
filter for the active query, and return partial responses from
the LLM.
BACKEND is the LLM backend in use.
PROC-INFO is a plist with process information and other context.
See `gptel-curl--get-response' for its contents.")
(defun gptel-curl--sentinel (process _status)
"Process sentinel for GPTel curl requests.
@@ -278,30 +298,27 @@ PROCESS and _STATUS are process parameters.
(clone-buffer "*gptel-error*" 'show)))
(when-let* (((eq (process-status process) 'exit))
(proc-info (alist-get process gptel-curl--process-alist))
(proc-token (plist-get proc-info :token))
(proc-callback (plist-get proc-info :callback)))
(pcase-let ((`(,response ,http-msg ,error)
(gptel-curl--parse-response proc-buf proc-token)))
(with-current-buffer proc-buf
(gptel-curl--parse-response proc-info))))
(plist-put proc-info :status http-msg)
(when error (plist-put proc-info :error error))
(funcall proc-callback response proc-info)))
(setf (alist-get process gptel-curl--process-alist nil 'remove) nil)
(kill-buffer proc-buf)))
(defun gptel-curl--parse-response (buf token)
(defun gptel-curl--parse-response (proc-info)
"Parse the buffer BUF with curl's response.
TOKEN is used to disambiguate multiple requests in a single
buffer."
(with-current-buffer buf
(progn
(let ((token (plist-get proc-info :token))
(parser (plist-get proc-info :parser)))
(goto-char (point-max))
(search-backward token)
(backward-char)
(pcase-let* ((`(,_ . ,header-size) (read (current-buffer))))
;; (if (search-backward token nil t)
;; (search-forward ")" nil t)
;; (goto-char (point-min)))
(goto-char (point-min))
(if-let* ((http-msg (string-trim
@@ -319,7 +336,7 @@ buffer."
(cond
((equal http-status "200")
(list (string-trim
(map-nested-elt response '(:choices 0 :message :content)))
(funcall parser nil response proc-info))
http-msg))
((plist-get response :error)
(let* ((error-plist (plist-get response :error))
@@ -332,7 +349,7 @@ buffer."
(t (list nil (concat "(" http-msg ") Could not parse HTTP response.")
"Could not parse HTTP response.")))
(list nil (concat "(" http-msg ") Could not parse HTTP response.")
"Could not parse HTTP response."))))))
"Could not parse HTTP response.")))))
(provide 'gptel-curl)
;;; gptel-curl.el ends here

gptel-ollama.el Normal file

@@ -0,0 +1,143 @@
;;; gptel-ollama.el --- Ollama support for gptel -*- lexical-binding: t; -*-
;; Copyright (C) 2023 Karthik Chikmagalur
;; Author: Karthik Chikmagalur <karthikchikmagalur@gmail.com>
;; Keywords: hypermedia
;; This program is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU General Public License for more details.
;; You should have received a copy of the GNU General Public License
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
;;; Commentary:
;; This file adds support for the Ollama LLM API to gptel
;;; Code:
(require 'gptel)
(require 'cl-generic)
;;; Ollama
(cl-defstruct (gptel-ollama (:constructor gptel--make-ollama)
(:copier nil)
(:include gptel-backend)))
(cl-defmethod gptel-curl--parse-stream ((_backend gptel-ollama) info)
";TODO: "
(when (bobp)
(re-search-forward "^{")
(forward-line 0))
(let* ((json-object-type 'plist)
(content-strs)
(content))
(condition-case nil
(while (setq content (json-read))
(let ((done (map-elt content :done))
(response (map-elt content :response)))
(push response content-strs)
(unless (eq done json-false)
(with-current-buffer (plist-get info :buffer)
(setq gptel--ollama-context (map-elt content :context)))
(goto-char (point-max)))))
(error (forward-line 0)))
(apply #'concat (nreverse content-strs))))
(cl-defmethod gptel--parse-response ((_backend gptel-ollama) response info)
(when-let ((context (map-elt response :context)))
(with-current-buffer (plist-get info :buffer)
(setq gptel--ollama-context context)))
(map-elt response :response))
(cl-defmethod gptel--request-data ((_backend gptel-ollama) prompts)
"JSON encode PROMPTS for sending to ChatGPT."
(let ((prompts-plist
`(:model ,gptel-model
,@prompts
:stream ,(or (and gptel-stream gptel-use-curl
(gptel-backend-stream gptel-backend))
:json-false))))
(when gptel--ollama-context
(plist-put prompts-plist :context gptel--ollama-context))
prompts-plist))
(cl-defmethod gptel--parse-buffer ((_backend gptel-ollama) &optional _max-entries)
(let ((prompts) (prop))
(setq prop (text-property-search-backward
'gptel 'response
(when (get-char-property (max (point-min) (1- (point)))
'gptel)
t)))
(if (prop-match-value prop)
(user-error "No user prompt found!")
(setq prompts (list
:system gptel--system-message
:prompt
(string-trim (buffer-substring-no-properties (prop-match-beginning prop)
(prop-match-end prop))
"[*# \t\n\r]+"))))))
;;;###autoload
(cl-defun gptel-make-ollama
(name &key host header key models stream
(protocol "http")
(endpoint "/api/generate"))
"Register an Ollama backend for gptel with NAME.
Keyword arguments:
HOST is where Ollama runs (with port), typically localhost:11434
MODELS is a list of available model names.
STREAM is a boolean to toggle streaming responses, defaults to
false.
PROTOCOL (optional) specifies the protocol, http by default.
ENDPOINT (optional) is the API endpoint for completions, defaults to
\"/api/generate\".
HEADER (optional) is for additional headers to send with each
request. It should be an alist or a function that returns an
alist, like:
((\"Content-Type\" . \"application/json\"))
KEY (optional) is a variable whose value is the API key, or
function that returns the key. This is typically not required for
local models like Ollama."
(let ((backend (gptel--make-ollama
:name name
:host host
:header header
:key key
:models models
:protocol protocol
:endpoint endpoint
:stream stream
:url (if protocol
(concat protocol "://" host endpoint)
(concat host endpoint)))))
(prog1 backend
(setf (alist-get name gptel--known-backends
nil nil #'equal)
backend))))
(defvar-local gptel--ollama-context nil
"Context for ollama conversations.
This variable holds the context array for conversations with
Ollama models.")
(provide 'gptel-ollama)
;;; gptel-ollama.el ends here

gptel-openai.el Normal file

@@ -0,0 +1,216 @@
;;; gptel-openai.el --- ChatGPT support for gptel -*- lexical-binding: t; -*-
;; Copyright (C) 2023 Karthik Chikmagalur
;; Author: Karthik Chikmagalur <karthikchikmagalur@gmail.com>
;; Keywords:
;; This program is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU General Public License for more details.
;; You should have received a copy of the GNU General Public License
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
;;; Commentary:
;; This file adds support for the ChatGPT API to gptel
;;; Code:
(require 'cl-generic)
;;; Common backend struct for LLM support
(cl-defstruct
(gptel-backend (:constructor gptel--make-backend)
(:copier gptel--copy-backend))
name host header protocol stream
endpoint key models url)
;;; OpenAI (ChatGPT)
(cl-defstruct (gptel-openai (:constructor gptel--make-openai)
(:copier nil)
(:include gptel-backend)))
(cl-defmethod gptel-curl--parse-stream ((_backend gptel-openai) _info)
(let* ((json-object-type 'plist)
(content-strs))
(condition-case nil
(while (re-search-forward "^data:" nil t)
(save-match-data
(unless (looking-at " *\\[DONE\\]")
(when-let* ((response (json-read))
(delta (map-nested-elt
response '(:choices 0 :delta)))
(content (plist-get delta :content)))
(push content content-strs)))))
(error
(goto-char (match-beginning 0))))
(apply #'concat (nreverse content-strs))))
(cl-defmethod gptel--parse-response ((_backend gptel-openai) response _info)
(map-nested-elt response '(:choices 0 :message :content)))
(cl-defmethod gptel--request-data ((_backend gptel-openai) prompts)
"JSON encode PROMPTS for sending to ChatGPT."
(let ((prompts-plist
`(:model ,gptel-model
:messages [,@prompts]
:stream ,(or (and gptel-stream gptel-use-curl
(gptel-backend-stream gptel-backend))
:json-false))))
(when gptel-temperature
(plist-put prompts-plist :temperature gptel-temperature))
(when gptel-max-tokens
(plist-put prompts-plist :max_tokens gptel-max-tokens))
prompts-plist))
(cl-defmethod gptel--parse-buffer ((_backend gptel-openai) &optional max-entries)
(let ((prompts) (prop))
(while (and
(or (not max-entries) (>= max-entries 0))
(setq prop (text-property-search-backward
'gptel 'response
(when (get-char-property (max (point-min) (1- (point)))
'gptel)
t))))
(push (list :role (if (prop-match-value prop) "assistant" "user")
:content
(string-trim
(buffer-substring-no-properties (prop-match-beginning prop)
(prop-match-end prop))
"[*# \t\n\r]+"))
prompts)
(and max-entries (cl-decf max-entries)))
(cons (list :role "system"
:content gptel--system-message)
prompts)))
;;;###autoload
(cl-defun gptel-make-openai
(name &key header key models stream
(host "api.openai.com")
(protocol "https")
(endpoint "/v1/chat/completions"))
"Register a ChatGPT backend for gptel with NAME.
Keyword arguments:
HOST (optional) is the API host, typically \"api.openai.com\".
MODELS is a list of available model names.
STREAM is a boolean to toggle streaming responses, defaults to
false.
PROTOCOL (optional) specifies the protocol, https by default.
ENDPOINT (optional) is the API endpoint for completions, defaults to
\"/v1/chat/completions\".
HEADER (optional) is for additional headers to send with each
request. It should be an alist or a function that returns an
alist, like:
((\"Content-Type\" . \"application/json\"))
KEY (optional) is a variable whose value is the API key, or
function that returns the key."
(let ((backend (gptel--make-openai
:name name
:host host
:header header
:key key
:models models
:protocol protocol
:endpoint endpoint
:stream stream
:url (if protocol
(concat protocol "://" host endpoint)
(concat host endpoint)))))
(prog1 backend
(setf (alist-get name gptel--known-backends
nil nil #'equal)
backend))))
;;; Azure
;;;###autoload
(cl-defun gptel-make-azure
(name &key host
(protocol "https")
(header (lambda () `(("api-key" . ,(gptel--get-api-key)))))
(key 'gptel-api-key)
models stream endpoint)
"Register an Azure backend for gptel with NAME.
Keyword arguments:
HOST is the API host.
MODELS is a list of available model names.
STREAM is a boolean to toggle streaming responses, defaults to
false.
PROTOCOL (optional) specifies the protocol, https by default.
ENDPOINT is the API endpoint for completions.
HEADER (optional) is for additional headers to send with each
request. It should be an alist or a function that returns an
alist, like:
((\"Content-Type\" . \"application/json\"))
KEY (optional) is a variable whose value is the API key, or
function that returns the key."
(let ((backend (gptel--make-openai
:name name
:host host
:header header
:key key
:models models
:protocol protocol
:endpoint endpoint
:stream stream
:url (if protocol
(concat protocol "://" host endpoint)
(concat host endpoint)))))
(prog1 backend
(setf (alist-get name gptel--known-backends
nil nil #'equal)
backend))))
;; GPT4All
;;;###autoload
(defalias 'gptel-make-gpt4all 'gptel-make-openai
"Register a GPT4All backend for gptel with NAME.
Keyword arguments:
HOST is where GPT4All runs (with port), typically localhost:4891
MODELS is a list of available model names.
STREAM is a boolean to toggle streaming responses, defaults to
false.
PROTOCOL specifies the protocol, https by default.
ENDPOINT (optional) is the API endpoint for completions, defaults to
\"/api/v1/completions\"
HEADER (optional) is for additional headers to send with each
request. It should be an alist or a function that returns an
alist, like:
((\"Content-Type\" . \"application/json\"))
KEY (optional) is a variable whose value is the API key, or
function that returns the key. This is typically not required for
local models like GPT4All.")
(provide 'gptel-openai)
;;; gptel-openai.el ends here

gptel-transient.el

@@ -116,23 +116,24 @@ which see."
gptel--system-message (max (- (window-width) 14) 20) nil nil t)))
("h" "Set directives for chat" gptel-system-prompt :transient t)]
[["Session Parameters"
(gptel--infix-provider)
;; (gptel--infix-model)
(gptel--infix-max-tokens)
(gptel--infix-num-messages-to-send)
(gptel--infix-temperature)
(gptel--infix-model)]
(gptel--infix-temperature)]
["Prompt:"
("-r" "From minibuffer instead" "-r")
("-i" "Replace/Delete prompt" "-i")
("p" "From minibuffer instead" "p")
("i" "Replace/Delete prompt" "i")
"Response to:"
("-m" "Minibuffer instead" "-m")
("-n" "New session" "-n"
("m" "Minibuffer instead" "m")
("n" "New session" "n"
:class transient-option
:prompt "Name for new session: "
:reader
(lambda (prompt _ history)
(read-string
prompt (generate-new-buffer-name "*ChatGPT*") history)))
("-e" "Existing session" "-e"
("e" "Existing session" "e"
:class transient-option
:prompt "Existing session: "
:reader
@@ -142,7 +143,7 @@ which see."
(lambda (buf) (and (buffer-local-value 'gptel-mode (get-buffer buf))
(not (equal (current-buffer) buf))))
t nil history)))
("-k" "Kill-ring" "-k")]
("k" "Kill-ring" "k")]
[:description gptel--refactor-or-rewrite
:if use-region-p
("r"
@@ -245,7 +246,7 @@ include."
:description "Number of past messages to send"
:class 'transient-lisp-variable
:variable 'gptel--num-messages-to-send
:key "n"
:key "-n"
:prompt "Number of past messages to include for context (leave empty for all): "
:reader 'gptel--transient-read-variable)
@@ -262,16 +263,61 @@ will get progressively longer!"
:description "Response length (tokens)"
:class 'transient-lisp-variable
:variable 'gptel-max-tokens
:key "<"
:key "-c"
:prompt "Response length in tokens (leave empty: default, 80-200: short, 200-500: long): "
:reader 'gptel--transient-read-variable)
(defclass gptel-provider-variable (transient-lisp-variable)
((model :initarg :model)
(model-value :initarg :model-value)
(always-read :initform t)
(set-value :initarg :set-value :initform #'set))
"Class used for gptel-backends.")
(cl-defmethod transient-format-value ((obj gptel-provider-variable))
(propertize (concat (gptel-backend-name (oref obj value)) ":"
(buffer-local-value (oref obj model) transient--original-buffer))
'face 'transient-value))
(cl-defmethod transient-infix-set ((obj gptel-provider-variable) value)
(pcase-let ((`(,backend-value ,model-value) value))
(funcall (oref obj set-value)
(oref obj variable)
(oset obj value backend-value))
(funcall (oref obj set-value)
(oref obj model)
(oset obj model-value model-value))))
(transient-define-infix gptel--infix-provider ()
"AI Provider for Chat."
:description "GPT Model: "
:class 'gptel-provider-variable
:prompt "Model provider: "
:variable 'gptel-backend
:model 'gptel-model
:key "-m"
:reader (lambda (prompt &rest _)
(let* ((backend-name
(if (<= (length gptel--known-backends) 1)
(caar gptel--known-backends)
(completing-read
prompt
(mapcar #'car gptel--known-backends))))
(backend (alist-get backend-name gptel--known-backends
nil nil #'equal))
(backend-models (gptel-backend-models backend))
(model-name (if (= (length backend-models) 1)
(car backend-models)
(completing-read
"Model: " backend-models))))
(list backend model-name))))
(transient-define-infix gptel--infix-model ()
"AI Model for Chat."
:description "GPT Model: "
:class 'transient-lisp-variable
:variable 'gptel-model
:key "m"
:key "-m"
:choices '("gpt-3.5-turbo" "gpt-3.5-turbo-16k" "gpt-4" "gpt-4-32k")
:reader (lambda (prompt &rest _)
(completing-read
@@ -283,7 +329,7 @@ will get progressively longer!"
:description "Randomness (0 - 2.0)"
:class 'transient-lisp-variable
:variable 'gptel-temperature
:key "t"
:key "-t"
:prompt "Set temperature (0.0-2.0, leave empty for default): "
:reader 'gptel--transient-read-variable)
@@ -313,42 +359,43 @@ will get progressively longer!"
:description "Send prompt"
(interactive (list (transient-args transient-current-command)))
(let ((stream gptel-stream)
(in-place (and (member "-i" args) t))
(in-place (and (member "i" args) t))
(output-to-other-buffer-p)
(backend-name (gptel-backend-name gptel-backend))
(buffer) (position)
(callback) (gptel-buffer-name)
(prompt
(and (member "-r" args)
(and (member "p" args)
(read-string
"Ask ChatGPT: "
(format "Ask %s: " (gptel-backend-name gptel-backend))
(apply #'buffer-substring-no-properties
(if (use-region-p)
(list (region-beginning) (region-end))
(list (line-beginning-position) (line-end-position))))))))
(cond
((member "-m" args)
((member "m" args)
(setq stream nil)
(setq callback
(lambda (resp info)
(if resp
(message "ChatGPT response: %s" resp)
(message "ChatGPT response error: %s" (plist-get info :status))))))
((member "-k" args)
(message "%s response: %s" backend-name resp)
(message "%s response error: %s" backend-name (plist-get info :status))))))
((member "k" args)
(setq stream nil)
(setq callback
(lambda (resp info)
(if (not resp)
(message "ChatGPT response error: %s" (plist-get info :status))
(message "%s response error: %s" backend-name (plist-get info :status))
(kill-new resp)
(message "ChatGPT response: copied to kill-ring.")))))
(message "%s response: copied to kill-ring." backend-name)))))
((setq gptel-buffer-name
(cl-some (lambda (s) (and (string-prefix-p "-n" s)
(substring s 2)))
(cl-some (lambda (s) (and (string-prefix-p "n" s)
(substring s 1)))
args))
(setq buffer
(gptel gptel-buffer-name
(condition-case nil
(gptel--api-key)
(gptel--get-api-key)
((error user-error)
(setq gptel-api-key
(read-passwd "OpenAI API key: "))))
@@ -370,7 +417,7 @@ will get progressively longer!"
(setq position (point)))
(setq output-to-other-buffer-p t))
((setq gptel-buffer-name
(cl-some (lambda (s) (and (string-prefix-p "-e" s)
(cl-some (lambda (s) (and (string-prefix-p "e" s)
(substring s 2)))
args))
(setq buffer (get-buffer gptel-buffer-name))

gptel.el

@@ -76,18 +76,20 @@
(require 'json)
(require 'map)
(require 'text-property-search)
(require 'gptel-openai)
(defgroup gptel nil
"Interact with ChatGPT from anywhere in Emacs."
:group 'hypermedia)
(defcustom gptel-host "api.openai.com"
"The API host queried by gptel."
:group 'gptel
:type 'string)
(defvar gptel-protocol "https"
"Protocol used to query `gptel-host'.")
;; (defcustom gptel-host "api.openai.com"
;; "The API host queried by gptel."
;; :group 'gptel
;; :type 'string)
(make-obsolete-variable
'gptel-host
"Use `gptel-make-openai' instead."
"0.5.0")
(defcustom gptel-proxy ""
"Path to a proxy to use for gptel interactions.
@@ -257,7 +259,7 @@ will get progressively longer!"
(defcustom gptel-model "gpt-3.5-turbo"
"GPT Model for chat.
The current options are
The current options for ChatGPT are
- \"gpt-3.5-turbo\"
- \"gpt-3.5-turbo-16k\"
- \"gpt-4\" (experimental)
@@ -287,18 +289,44 @@ To set the temperature for a chat session interactively call
:group 'gptel
:type 'number)
(defvar gptel--known-backends nil
"Alist of LLM backends known to gptel.
This is an alist mapping user-provided names to backend structs,
see `gptel-backend'.
You can have more than one backend pointing to the same resource
with differing settings.")
(defvar gptel--openai
(gptel-make-openai
"ChatGPT"
:header (lambda () `(("Authorization" . ,(concat "Bearer " (gptel--get-api-key)))))
:key #'gptel--get-api-key
:stream t
:models '("gpt-3.5-turbo" "gpt-3.5-turbo-16k" "gpt-4" "gpt-4-32k")))
(defvar-local gptel-backend gptel--openai)
(defvar-local gptel--bounds nil)
(put 'gptel--bounds 'safe-local-variable #'gptel--always)
(defvar-local gptel--num-messages-to-send nil)
(put 'gptel--num-messages-to-send 'safe-local-variable #'gptel--always)
(defvar gptel--debug nil)
(defvar gptel--debug nil
"Enable printing debug messages.
Also shows the response buffer when making requests.")
(defun gptel-api-key-from-auth-source (&optional host user)
"Lookup api key in the auth source.
By default, `gptel-host' is used as HOST and \"apikey\" as USER."
(if-let ((secret (plist-get (car (auth-source-search
:host (or host gptel-host)
By default, the LLM host for the active backend is used as HOST,
and \"apikey\" as USER."
(if-let ((secret
(plist-get
(car (auth-source-search
:host (or host (gptel-backend-host gptel-backend))
:user (or user "apikey")
:require '(:secret)))
:secret)))
@@ -308,7 +336,7 @@ By default, `gptel-host' is used as HOST and \"apikey\" as USER."
(user-error "No `gptel-api-key' found in the auth source")))
;; FIXME Should we utf-8 encode the api-key here?
(defun gptel--api-key ()
(defun gptel--get-api-key ()
"Get api key from `gptel-api-key'."
(pcase gptel-api-key
((pred stringp) gptel-api-key)
@@ -336,7 +364,8 @@ Currently saving and restoring state is implemented only for
(progn
(when-let ((bounds (org-entry-get (point-min) "GPTEL_BOUNDS")))
(mapc (pcase-lambda (`(,beg . ,end))
(put-text-property beg end 'gptel 'response))
(add-text-properties
beg end '(gptel response rear-nonsticky t)))
(read bounds))
(message "gptel chat restored."))
(when-let ((model (org-entry-get (point-min) "GPTEL_MODEL")))
@@ -431,8 +460,8 @@ opening the file."
(gptel--restore-state)
(setq gptel--old-header-line header-line-format
header-line-format
(list (concat (propertize " " 'display '(space :align-to 0))
(format "%s" (buffer-name)))
(list '(:eval (concat (propertize " " 'display '(space :align-to 0))
(format "%s" (gptel-backend-name gptel-backend))))
(propertize " Ready" 'face 'success)
'(:eval
(let* ((l1 (length gptel-model))
@@ -468,8 +497,8 @@ opening the file."
(cl-defun gptel-request
(&optional prompt &key callback
(buffer (current-buffer))
position context (stream nil)
(in-place nil)
position context
(stream nil) (in-place nil)
(system gptel--system-message))
"Request a response from ChatGPT for PROMPT.
@@ -581,7 +610,7 @@ instead."
(interactive "P")
(if (and arg (require 'gptel-transient nil t))
(call-interactively #'gptel-menu)
(message "Querying ChatGPT...")
(message "Querying %s..." (gptel-backend-name gptel-backend))
(let* ((response-pt
(if (use-region-p)
(set-marker (make-marker) (region-end))
@@ -698,38 +727,24 @@ there."
(goto-char (point-max))))
(t (goto-char (or prompt-end (point-max)))))
(let ((max-entries (and gptel--num-messages-to-send
(* 2 gptel--num-messages-to-send)))
(prop) (prompts))
(while (and
(or (not max-entries) (>= max-entries 0))
(setq prop (text-property-search-backward
'gptel 'response
(when (get-char-property (max (point-min) (1- (point)))
'gptel)
t))))
(push (list :role (if (prop-match-value prop) "assistant" "user")
:content
(string-trim
(buffer-substring-no-properties (prop-match-beginning prop)
(prop-match-end prop))
"[*# \t\n\r]+"))
prompts)
(and max-entries (cl-decf max-entries)))
(cons (list :role "system"
:content gptel--system-message)
prompts)))))
(* 2 gptel--num-messages-to-send))))
(gptel--parse-buffer gptel-backend max-entries)))))
(defun gptel--request-data (prompts)
"JSON encode PROMPTS for sending to ChatGPT."
(let ((prompts-plist
`(:model ,gptel-model
:messages [,@prompts]
:stream ,(or (and gptel-stream gptel-use-curl) :json-false))))
(when gptel-temperature
(plist-put prompts-plist :temperature gptel-temperature))
(when gptel-max-tokens
(plist-put prompts-plist :max_tokens gptel-max-tokens))
prompts-plist))
(cl-defgeneric gptel--parse-buffer (backend max-entries)
"Parse the current buffer backwards from point and return a list
of prompts.
BACKEND is the LLM backend in use.
MAX-ENTRIES is the number of queries/responses to include for
context.")
(cl-defgeneric gptel--request-data (backend prompts)
"Generate a plist of all data for an LLM query.
BACKEND is the LLM backend in use.
PROMPTS is the plist of previous user queries and LLM responses.")
;; TODO: Use `run-hook-wrapped' with an accumulator instead to handle
;; buffer-local hooks, etc.
@@ -773,13 +788,17 @@ the response is inserted into the current buffer after point.
(message-log-max nil)
(url-request-method "POST")
(url-request-extra-headers
`(("Content-Type" . "application/json")
("Authorization" . ,(concat "Bearer " (gptel--api-key)))))
(append '(("Content-Type" . "application/json"))
(when-let ((backend-header (gptel-backend-header gptel-backend)))
(if (functionp backend-header)
(funcall backend-header)
backend-header))))
(url-request-data
(encode-coding-string
(json-encode (gptel--request-data (plist-get info :prompt)))
(json-encode (gptel--request-data
gptel-backend (plist-get info :prompt)))
'utf-8)))
(url-retrieve (format "%s://%s/v1/chat/completions" gptel-protocol gptel-host)
(url-retrieve (gptel-backend-url gptel-backend)
(lambda (_)
(pcase-let ((`(,response ,http-msg ,error)
(gptel--url-parse-response (current-buffer))))
@@ -790,6 +809,16 @@ the response is inserted into the current buffer after point.
(kill-buffer)))
nil t nil)))
(cl-defgeneric gptel--parse-response (backend response proc-info)
"Response extractor for LLM requests.
BACKEND is the LLM backend in use.
RESPONSE is the parsed JSON of the response, as a plist.
PROC-INFO is a plist with process information and other context.
See `gptel-curl--get-response' for its contents.")
(defun gptel--url-parse-response (response-buffer)
"Parse response in RESPONSE-BUFFER."
(when (buffer-live-p response-buffer)
@@ -809,7 +838,8 @@ the response is inserted into the current buffer after point.
(json-readtable-error 'json-read-error))))))
(cond
((string-match-p "200 OK" http-msg)
(list (string-trim (map-nested-elt response '(:choices 0 :message :content)))
(list (string-trim (gptel--parse-response gptel-backend response
`(:buffer ,response-buffer)))
http-msg))
((plist-get response :error)
(let* ((error-plist (plist-get response :error))
@@ -837,7 +867,7 @@ buffer created or switched to."
(read-string "Session name: " (generate-new-buffer-name gptel-default-session))
gptel-default-session)
(condition-case nil
(gptel--api-key)
(gptel--get-api-key)
((error user-error)
(setq gptel-api-key
(read-passwd "OpenAI API key: "))))