#+title: Todo

* Name ideas
- llama herder
- llama herdsman/herdswoman/herdboy ??
- llama shepherd ?

* MVP
- [X] fix stopping (doesn't work correctly at all) - seems done

* Future Features
- [ ] support for model selection by name on a unified port for /api & /completions (see the routing sketch below)
- [ ] separation of proxy/selection stuff
  - ? config for unmanaged instances for auto model-selection by name
- [ ] automatic internal port management (search for free ports; see the free-port sketch below)
- [ ] Diagnostic Overview UI/API
- [ ] Config UI/API ?
- [ ] better book-keeping about in-flight requests ? (needed ?)
- [ ] multi-node stuff - how exactly ?
  - clustering ? (one manager per node ?)
  - ssh support ???
- [ ] automatic RAM usage calculation ?
- [ ] other runners, e.g. docker / run from path, etc.
- [ ] other backends ?
- [ ] more advanced start/stop behavior
  - more config ? e.g. pinning / priorities / prefer-to-kill / start-initially
  - LRU / most-used prioritized to keep running (see the LRU sketch below)
  - speculative relaunch
  - scheduling of how to order in-flight requests + restarts to handle them optimally
- [ ] advanced high-level foo
  - automatic context-size selection per request / restart with a bigger context if the current instance's context is too small
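
Rough sketch of how model selection by name on a unified port could work: sniff the =model= field from the JSON body and reverse-proxy to the matching instance. This assumes Go and a static name-to-address map; the map, ports, and endpoints are placeholders, not the real config.

#+begin_src go
// Minimal sketch: route requests on one port to per-model instances
// based on the "model" field in the JSON body. Placeholder data only.
package main

import (
	"bytes"
	"encoding/json"
	"io"
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

// Hypothetical model-name -> internal instance address table.
var instances = map[string]string{
	"llama3:8b":  "http://127.0.0.1:9001",
	"mistral:7b": "http://127.0.0.1:9002",
}

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// Both /api-style and /completions-style requests carry the
		// model name in the JSON body, so read and inspect it.
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		var req struct {
			Model string `json:"model"`
		}
		_ = json.Unmarshal(body, &req)

		target, ok := instances[req.Model]
		if !ok {
			http.Error(w, "unknown model: "+req.Model, http.StatusNotFound)
			return
		}

		// Restore the body so the proxied request is forwarded intact.
		r.Body = io.NopCloser(bytes.NewReader(body))
		u, _ := url.Parse(target)
		httputil.NewSingleHostReverseProxy(u).ServeHTTP(w, r)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
#+end_src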
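
Rough sketch of the free-port search for internal port management: ask the OS for an unused port by binding to port 0. Just one possible approach, not a fixed design.

#+begin_src go
// Minimal sketch: let the kernel pick a free TCP port for an instance.
package main

import (
	"fmt"
	"net"
)

// freePort listens on port 0 so the OS assigns an unused port,
// then releases the listener and returns the chosen port number.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	p, err := freePort()
	if err != nil {
		panic(err)
	}
	fmt.Println("would start the instance on internal port", p)
}
#+end_src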
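
Rough sketch of LRU-style eviction for the keep-running prioritization: pick the unpinned instance that was used least recently as the one to stop. =Instance= and its fields are placeholders.

#+begin_src go
// Minimal sketch: choose which running instance to stop first (LRU).
package main

import (
	"fmt"
	"time"
)

type Instance struct {
	Model    string
	LastUsed time.Time
	Pinned   bool // pinned instances are never chosen for eviction
}

// pickVictim returns the unpinned instance used least recently,
// or nil if nothing can be stopped.
func pickVictim(running []*Instance) *Instance {
	var victim *Instance
	for _, inst := range running {
		if inst.Pinned {
			continue
		}
		if victim == nil || inst.LastUsed.Before(victim.LastUsed) {
			victim = inst
		}
	}
	return victim
}

func main() {
	now := time.Now()
	running := []*Instance{
		{Model: "llama3:8b", LastUsed: now.Add(-10 * time.Minute)},
		{Model: "mistral:7b", LastUsed: now.Add(-2 * time.Minute)},
		{Model: "phi3:mini", LastUsed: now.Add(-30 * time.Minute), Pinned: true},
	}
	fmt.Println("would stop:", pickVictim(running).Model)
}
#+end_src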