<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Agents · ArchWorks</title><link>https://archworks.co/tags/agents/</link><description/><language>en</language><lastBuildDate>Sun, 07 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://archworks.co/tags/agents/index.xml" rel="self" type="application/rss+xml"/><item><title>A self-hosted multi-agent LLM stack</title><link>https://archworks.co/docs/self-hosted-llm-stack/</link><pubDate>Sun, 07 Jun 2026 00:00:00 +0000</pubDate><guid isPermaLink="true">https://archworks.co/docs/self-hosted-llm-stack/</guid><description>The full writeup: a GPU host running llama.cpp + llama-swap behind a gateway, the OpenCode agent runtime on top, the single-slot constraint and the agents built around it, the five subagent rules, three-layer skills, a memory layer that learns, and the serving-optimization methodology that multiplied throughput on the same hardware.</description></item></channel></rss>