redvault-ai/llama_proxy_man/TODO.org
#+title: Todo
* Name ideas
- llama herder
- llama herdsman/women/boy ??
- llama shepherd ?
* MVP
- [X] fix stopping (doesn't work correctly at all)
- seems done
* Future Features
- [ ] support for model selection by name on a unified port for /api & /completions
- [ ] separation of the proxy/selection layer ? config for unmanaged instances, for automatic model selection by name
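The unified-port idea above boils down to a routing table from model name to internal instance port. A minimal sketch (the model names, ports, and `route_for` helper are hypothetical, not part of the current codebase):

```rust
use std::collections::HashMap;

/// Hypothetical routing table: model name -> internal instance port.
/// A unified /api + /completions listener would parse the request
/// body's "model" field and proxy to the matching instance.
fn route_for(model: &str, table: &HashMap<&str, u16>) -> Option<u16> {
    table.get(model).copied()
}

fn main() {
    let mut table = HashMap::new();
    table.insert("llama-3-8b", 18081);
    table.insert("qwen2-7b", 18082);
    assert_eq!(route_for("llama-3-8b", &table), Some(18081));
    // Unknown model names could 404 or fall back to a default instance.
    assert_eq!(route_for("unknown", &table), None);
}
```

Unmanaged instances would just be extra static entries in this table, with no start/stop lifecycle attached.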
- [ ] automatic internal port management (search for free ports)
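One common way to "search for free ports" is to let the OS pick one by binding to port 0. A minimal sketch (the `find_free_port` helper is hypothetical, not part of the current codebase):

```rust
use std::net::TcpListener;

/// Ask the kernel for an unused TCP port by binding to port 0, then
/// read the assigned port back. The listener is dropped on return, so
/// there is a small race window before the managed llama.cpp instance
/// re-binds the port.
fn find_free_port() -> std::io::Result<u16> {
    let listener = TcpListener::bind(("127.0.0.1", 0))?;
    Ok(listener.local_addr()?.port())
}

fn main() -> std::io::Result<()> {
    let port = find_free_port()?;
    assert!(port > 0);
    println!("free port: {port}");
    Ok(())
}
```

To shrink the race window, the manager could hold the listener until just before spawning the instance, or retry with a fresh port if the spawn fails with "address in use".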
- [ ] Diagnostic Overview UI/API
- [ ] Config UI/API ?
- [ ] better book-keeping about in-flight requests ? (needed ?)
- [ ] multi node stuff
- how exactly ?
- clustering ? (one manager per node ?)
- ssh support ???
- [ ] automatic ram usage calc ?
- [ ] other runners
  - e.g. Docker, run from PATH, etc.
- [ ] other backends ?
- [ ] more advanced start/stop behavior
- more config ? e.g. pinning/priorities/prefer-to-kill/start-initially
  - LRU / most-used instances prioritized to keep running
- speculative relaunch
- scheduling of how to order in-flight requests + restarts to handle them optimally
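The pinning + LRU ideas above amount to a victim-selection policy when memory is needed: stop the least recently used instance that is not pinned. A minimal sketch under those assumptions (the `Instance` struct and `pick_victim` helper are hypothetical, not part of the current codebase):

```rust
/// Hypothetical per-instance record for start/stop scheduling.
struct Instance {
    name: &'static str,
    pinned: bool,      // "pinning" config: never auto-stopped
    last_used_ms: u64, // e.g. unix millis of the last proxied request
}

/// One possible policy: among non-pinned instances, stop the one that
/// was used least recently (smallest last_used_ms).
fn pick_victim(instances: &[Instance]) -> Option<&Instance> {
    instances
        .iter()
        .filter(|i| !i.pinned)
        .min_by_key(|i| i.last_used_ms)
}

fn main() {
    let instances = vec![
        Instance { name: "llama-3-8b", pinned: true,  last_used_ms: 1_000 },
        Instance { name: "qwen2-7b",   pinned: false, last_used_ms: 2_000 },
        Instance { name: "phi-3",      pinned: false, last_used_ms: 9_000 },
    ];
    // llama-3-8b is older but pinned, so qwen2-7b is stopped first.
    assert_eq!(pick_victim(&instances).unwrap().name, "qwen2-7b");
}
```

Priorities and prefer-to-kill could extend this by sorting on a `(priority, last_used_ms)` key instead of recency alone.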
- [ ] advanced high-level foo
  - automatic context-size selection per request / restart with a bigger context if the current instance's context is too small