A self-hosted multi-agent LLM stack
The full writeup: a GPU host running llama.cpp + llama-swap behind a gateway, the OpenCode agent runtime on top, the single-slot constraint and the agents built around it, the five subagent rules, the three-layer skills, a …