Update llama_proxy_man/README.md

Tristan D. 2024-10-08 15:37:58 +00:00
parent 66cf52e2ce
commit 734a6300a1

@@ -3,3 +3,8 @@
- manages multiple llama.cpp instances in the background
- keeps track of used & available video & CPU memory
- starts/stops llama.cpp instances as needed, to ensure the memory limit is never exceeded (see the sketch below)
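
A minimal sketch of what the memory accounting and stop decision might look like, in Rust. The `Instance` and `Manager` types, the LRU stop policy, and all field names are illustrative assumptions, not taken from this repo:

```rust
// Hypothetical sketch: track VRAM per instance and stop the
// least-recently-used one until a new model fits under the limit.
struct Instance {
    model_name: String,
    vram_mb: u64,   // video memory this instance occupies
    last_used: u64, // tick of the last request served
    running: bool,
}

struct Manager {
    vram_limit_mb: u64,
    instances: Vec<Instance>,
}

impl Manager {
    fn vram_used_mb(&self) -> u64 {
        self.instances.iter().filter(|i| i.running).map(|i| i.vram_mb).sum()
    }

    /// Stop running instances (LRU first) until `needed_mb` fits under the limit.
    fn ensure_capacity(&mut self, needed_mb: u64) {
        while self.vram_used_mb() + needed_mb > self.vram_limit_mb {
            let Some(victim) = self
                .instances
                .iter_mut()
                .filter(|i| i.running)
                .min_by_key(|i| i.last_used)
            else {
                break; // nothing left to stop
            };
            println!("stopping {} to free {} MB", victim.model_name, victim.vram_mb);
            victim.running = false; // real code would kill the llama.cpp process
        }
    }
}

fn main() {
    let mut mgr = Manager {
        vram_limit_mb: 24_000,
        instances: vec![
            Instance { model_name: "llama-8b".into(), vram_mb: 9_000, last_used: 1, running: true },
            Instance { model_name: "llama-13b".into(), vram_mb: 14_000, last_used: 2, running: true },
        ],
    };
    mgr.ensure_capacity(10_000); // stops "llama-8b" (least recently used)
    println!("vram in use: {} MB", mgr.vram_used_mb());
}
```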
## Ideas
- smarter logic for deciding which instance to stop
- unified API, proxying by a `model_name` param to standardized `/v1/chat/completions`- and `/completion`-like endpoints (see the routing sketch below)
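
One way the unified-API idea could route requests, sketched in Rust under the assumption that each llama.cpp instance listens on its own local port. The `route` helper, port numbers, and model names are all hypothetical:

```rust
// Hypothetical sketch: map the `model` field of an incoming
// /v1/chat/completions request to the upstream instance serving it.
use std::collections::HashMap;

fn route(model_name: &str, upstreams: &HashMap<&str, &str>) -> Option<String> {
    upstreams
        .get(model_name)
        .map(|addr| format!("http://{addr}/v1/chat/completions"))
}

fn main() {
    let upstreams = HashMap::from([
        ("llama-8b", "127.0.0.1:8081"),
        ("llama-13b", "127.0.0.1:8082"),
    ]);
    // A real proxy would parse the JSON body, extract "model",
    // then forward the request (starting the instance if needed).
    assert_eq!(
        route("llama-8b", &upstreams).as_deref(),
        Some("http://127.0.0.1:8081/v1/chat/completions")
    );
    println!("{:?}", route("llama-13b", &upstreams));
}
```

Routing on the existing `model` field of the OpenAI-style request body would let clients switch models without changing endpoints, which is presumably the point of the standardized-endpoint idea above.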