redvault-ai/llama_proxy_man
Tristan Druyen bf6caabfe8
feat: Embedded proxy_man for forge
- Add `figment` for config yamls
- Small `Makefile.toml` fixes ? (docset seems still broken ??)
- Copy `config.yaml` workspace & forge
- Embed proxy_man in forge
- Remove `backend_process.rs` and `process.rs`
- Update `llama_proxy_man/Cargo.toml` and `config.rs` for new dependencies
- Format
2025-02-11 04:22:14 +01:00
src feat: Embedded proxy_man for forge 2025-02-11 04:22:14 +01:00
Cargo.toml feat: Embedded proxy_man for forge 2025-02-11 04:22:14 +01:00
config.yaml Run leptosfmt 2025-02-11 01:02:28 +01:00
README.md Update llama_proxy_man/README.md 2024-10-08 15:37:58 +00:00
TODO.org Add llama_proxy_man pkg 2024-09-19 17:21:46 +02:00

LLama Herder

  • manages multiple llama.cpp instances in the background
  • keeps track of used & available GPU (VRAM) & CPU memory
  • starts/stops llama.cpp instances on demand, ensuring the memory limit is never exceeded
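
The start/stop accounting above can be sketched roughly as follows. This is a minimal illustration, not the crate's real API: the `Herder` type, the per-model sizes, and the "evict largest first" rule are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical sketch: tracks memory used by running llama.cpp
/// instances against a hard limit (names and sizes are illustrative).
struct Herder {
    limit_mb: u64,
    running: HashMap<String, u64>, // model name -> memory footprint in MB
}

impl Herder {
    fn new(limit_mb: u64) -> Self {
        Herder { limit_mb, running: HashMap::new() }
    }

    fn used_mb(&self) -> u64 {
        self.running.values().sum()
    }

    /// Start a model, stopping others (largest first) until it fits.
    /// Returns the names of the instances that were stopped.
    fn start(&mut self, name: &str, size_mb: u64) -> Vec<String> {
        let mut stopped = Vec::new();
        while self.used_mb() + size_mb > self.limit_mb {
            // Evict the largest running instance to free memory fastest.
            let victim = self
                .running
                .iter()
                .max_by_key(|(_, sz)| **sz)
                .map(|(n, _)| n.clone());
            match victim {
                Some(n) => {
                    self.running.remove(&n);
                    stopped.push(n);
                }
                // Nothing left to stop; the model may simply not fit.
                None => break,
            }
        }
        self.running.insert(name.to_string(), size_mb);
        stopped
    }
}
```

A smarter eviction policy (see "Ideas" below) could weigh recency of use or model load time instead of raw size.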

Ideas

  • smarter logic to decide what to stop
  • unified API, with proxying by a `model_name` param for standardized `/v1/chat/completions` and `/completion`-like endpoints
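
The unified-API idea boils down to a routing table from model name to upstream instance. A minimal sketch, assuming a `Router` type with a per-model map and an optional default upstream (none of which exist in the crate yet):

```rust
use std::collections::HashMap;

/// Hypothetical router: picks the upstream llama.cpp instance that
/// serves the `model_name` found in an incoming request.
struct Router {
    routes: HashMap<String, String>, // model name -> upstream base URL
    default: Option<String>,         // fallback when the model is unknown
}

impl Router {
    fn upstream(&self, model: &str) -> Option<&str> {
        self.routes
            .get(model)
            .or(self.default.as_ref())
            .map(|s| s.as_str())
    }
}
```

In the real proxy this lookup would happen after parsing the request body of `/v1/chat/completions` (or the equivalent `/completion` field), and a hit could also trigger the herder to start the instance if it is not running.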