From Zero to Multi-Agent in 12 Hours

The Problem

My AI agent — Qwen Code running a 27B model on an RTX 3090 — is stuck on my LAN. No public IPv4, carrier-grade NAT from the Belgian ISP. The agent is a prisoner of the local network.

Goal: make the agent accessible over 5G from a phone. By the end of the day, I also wanted a frontier agent on a remote server, and both agents talking to each other. I didn't know that last part yet.

The Timeline

Morning: The VPS

Found Contabo after skipping it four times in sponsored Google results while searching for OVH and Hetzner. Subscribed to their Cloud VPS 10: AMD EPYC Zen 4, 4 vCPU, 8 GB RAM, 75 GB NVMe, 200 Mbps — 4.05€/month prepaid 6 months (29.40€ total). Datacenter assigned in France, about 450 km from home, ~20ms latency.

For comparison, OVH was offering Sandy Bridge (2011 silicon) at 7€+/month. Contabo: EPYC Zen 4 (2023 silicon) at 4€/month. The market has shifted and OVH hasn't noticed.

Morning: Hardening

Fresh Debian install, hardened from scratch: ed25519 SSH keys only (password auth disabled), root login disabled, firewall locked down to SSH and VPN traffic only. Standard stuff, but it matters — this box will be exposed to the internet permanently.
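The SSH part of that checklist can be verified mechanically. Below is a minimal sketch, not the exact procedure used here: it parses the text of an `sshd_config` and reports any of the three settings that differ from the hardened values. `PasswordAuthentication`, `PermitRootLogin`, and `PubkeyAuthentication` are standard OpenSSH keywords; the helper names `parse_sshd_config` and `audit` are made up for illustration, and the sketch ignores `Match` blocks and included files.

```python
# Sketch: audit sshd_config text for the hardening described above.
# Helper names are hypothetical; the keywords are standard OpenSSH options.

EXPECTED = {
    "passwordauthentication": "no",   # keys only
    "permitrootlogin": "no",          # no direct root login
    "pubkeyauthentication": "yes",    # ed25519 keys are accepted via pubkey auth
}


def parse_sshd_config(text: str) -> dict[str, str]:
    """Return the first value seen for each keyword (sshd uses the first match)."""
    settings: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip()
        settings.setdefault(key, value)
    return settings


def audit(text: str) -> list[str]:
    """List every expected setting that is missing or has a different value."""
    settings = parse_sshd_config(text)
    problems = []
    for key, wanted in EXPECTED.items():
        actual = settings.get(key)
        if actual is None or actual.lower() != wanted:
            problems.append(f"{key}: expected {wanted!r}, found {actual!r}")
    return problems
```

Calling `audit(open("/etc/ssh/sshd_config").read())` on the server would return an empty list when all three settings are explicit. The sketch is deliberately strict: it flags a missing keyword even where the OpenSSH default would already be safe.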

Afternoon: WireGuard Mesh

Hub-and-spoke topology deployed:

Phone (mobile) ——5G——> VPS (France)  <——WG—— Home gateway
                            hub/relay                  |
                                                  Local workstation
                                                  Qwen Code / RTX 3090

The VPS acts as a WireGuard relay — it's the only node with a public IP, so everything routes through it. The phone, the home gateway, and the workstation are all peers on the same VPN mesh.
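To make the hub-and-spoke layout concrete, here is a small sketch that renders `wg-quick` style configuration text for the hub and for one spoke. Everything specific in it is an assumption for illustration: the `10.8.0.0/24` tunnel range, the `192.168.1.0/24` LAN, the documentation address `203.0.113.10`, and the key placeholders. The field names (`Address`, `ListenPort`, `AllowedIPs`, `Endpoint`, `PersistentKeepalive`) are standard WireGuard configuration keys.

```python
# Sketch: render wg-quick style configs for a hub-and-spoke layout.
# Addresses, names, and key placeholders are hypothetical.
from dataclasses import dataclass


@dataclass
class Spoke:
    name: str
    vpn_ip: str                            # address inside the tunnel
    public_key: str                        # placeholder, not a real key
    routed_subnets: tuple[str, ...] = ()   # LAN ranges reachable behind this spoke


def hub_config(hub_ip: str, listen_port: int, spokes: list[Spoke]) -> str:
    """The hub lists every spoke as a peer and routes each spoke's addresses to it."""
    lines = [
        "[Interface]",
        f"Address = {hub_ip}/24",
        f"ListenPort = {listen_port}",
        "PrivateKey = <hub-private-key>",
    ]
    for s in spokes:
        allowed = ", ".join([f"{s.vpn_ip}/32", *s.routed_subnets])
        lines += ["", f"# {s.name}", "[Peer]",
                  f"PublicKey = {s.public_key}", f"AllowedIPs = {allowed}"]
    return "\n".join(lines) + "\n"


def spoke_config(spoke: Spoke, hub_public_key: str,
                 hub_endpoint: str, vpn_subnet: str) -> str:
    """A spoke has a single peer, the hub, and sends all VPN traffic to it."""
    return "\n".join([
        "[Interface]",
        f"Address = {spoke.vpn_ip}/24",
        "PrivateKey = <spoke-private-key>",
        "",
        "[Peer]",
        f"PublicKey = {hub_public_key}",
        f"Endpoint = {hub_endpoint}",
        f"AllowedIPs = {vpn_subnet}",
        # Keepalives hold the NAT mapping open so the hub can reach the spoke.
        "PersistentKeepalive = 25",
    ]) + "\n"
```

For example, `hub_config("10.8.0.1", 51820, [Spoke("phone", "10.8.0.2", "<phone-public-key>")])` produces a hub file with one `[Peer]` section. Because spokes only talk to each other through the hub, the hub also needs IP forwarding enabled; the sketch does not cover that.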

This took longer than expected. DNS resolution from VPN peers needed reconfiguration. Return routes from the workstation to VPN peers were missing. The phone's private DNS settings conflicted with the VPN config. Each problem was trivial on its own, but they compounded: four hours of debugging for what should have been a 30-minute setup. That's networking.
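The missing return route is the easiest of these to reason about on paper. The sketch below does a longest-prefix match over a toy routing table to show where a reply to a VPN peer would be sent. The addresses are hypothetical, and it assumes the WireGuard gateway (`192.168.1.2` here) is a different box from the default router (`192.168.1.1`); if they were the same machine, the default route alone would be enough.

```python
# Sketch: which route does a reply to a VPN peer take?
# The routing table and addresses are illustrative, not taken from the real setup.
import ipaddress
from typing import Optional


def best_route(destination: str, routes: dict[str, str]) -> Optional[tuple[str, str]]:
    """Longest-prefix match: return (prefix, next_hop), or None if nothing matches."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes.items()
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    net, hop = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), hop


workstation_routes = {
    "192.168.1.0/24": "direct",
    "0.0.0.0/0": "192.168.1.1",   # default router, knows nothing about the VPN
}
before = best_route("10.8.0.2", workstation_routes)

# Adding a route for the tunnel range via the WireGuard gateway fixes the return path.
workstation_routes["10.8.0.0/24"] = "192.168.1.2"
after = best_route("10.8.0.2", workstation_routes)
```

Before the fix, the reply to `10.8.0.2` matches only the default route and leaves through the ISP router, so the phone never sees it. After the fix, the more specific `/24` wins and the reply goes back through the tunnel.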

Result: first SSH from the phone over 5G into the local agent. The LAN prison is broken.

Afternoon: The Frontier Agent

Installed Claude Code on the VPS. Node.js 22, latest Claude Code release. Suddenly the VPS isn't just a VPN relay — it's an execution environment for a frontier agent. From the phone, one alias reaches the local 27B agent, another reaches Claude Sonnet/Opus on the VPS. Both behind the same VPN.

Then the realization: if I can SSH from the phone to the VPS to reach Claude Code, the local agent can do the same thing.

Afternoon: The Discovery

Why build a multi-agent orchestration framework when SSH + tmux already exist?

The protocol is almost embarrassingly simple. The local agent sends a prompt to the frontier agent by typing into its tmux session via SSH. It reads the response by capturing the tmux pane. It confirms tool calls by sending Enter. That's the entire "framework."

No SDK, no message format, no JSON serialization. One agent types text into another agent's terminal and reads what appears. Exactly like a human would.
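The three moves (type a prompt, press Enter, read the pane) map onto two tmux subcommands run over SSH. The following is a minimal sketch of that mapping, not the agent's actual tooling: the function names, the host alias `vps`, and the session name `claude` are hypothetical. The tmux flags are real: `send-keys -l` sends text literally, and `capture-pane -p` prints the pane to stdout, with `-S` setting how far back in the history to start.

```python
# Sketch: the delegation protocol as plain ssh + tmux command lines.
# Host alias, session name, and function names are hypothetical.
import shlex
import subprocess


def remote_tmux(host: str, *tmux_args: str) -> list[str]:
    """Build an ssh command that runs one tmux subcommand on `host`."""
    # ssh hands the remote command to a shell, so each argument is quoted.
    remote = "tmux " + " ".join(shlex.quote(a) for a in tmux_args)
    return ["ssh", host, remote]


def send_prompt_cmds(host: str, session: str, prompt: str) -> list[list[str]]:
    """Type the prompt literally, then press Enter as a separate key."""
    return [
        remote_tmux(host, "send-keys", "-t", session, "-l", "--", prompt),
        remote_tmux(host, "send-keys", "-t", session, "Enter"),
    ]


def confirm_cmd(host: str, session: str) -> list[str]:
    """Confirming a tool call is just another bare Enter."""
    return remote_tmux(host, "send-keys", "-t", session, "Enter")


def capture_cmd(host: str, session: str, history_lines: int = 200) -> list[str]:
    """Print the pane contents, including some scrollback, to stdout."""
    return remote_tmux(host, "capture-pane", "-p", "-t", session,
                       "-S", f"-{history_lines}")


def run(cmd: list[str]) -> str:
    """Execute one of the commands above and return its stdout."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
```

A round trip would then look like `for c in send_prompt_cmds("vps", "claude", "Summarize this log"): run(c)` followed later by `run(capture_cmd("vps", "claude"))`. Building the commands separately from running them keeps the sketch easy to inspect without a live server.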

Evening: The Test

My local Qwen 27B agent, given the instruction to delegate a complex question to a frontier model, autonomously:

1. Opened a new tmux session with Claude Code on the VPS
2. Switched the model from Sonnet to Opus via the interactive menu
3. Asked a complex technical question (Transformer vs State Space Model architectures)
4. Waited for the response, captured the terminal buffer
5. Parsed and summarized the result

A local 27B model self-allocating frontier compute when it judges the task exceeds its own capacity. The routing is an agent decision, not a static configuration.

Evening: Validation

Systematic testing of the delegation protocol revealed useful patterns:

Discovery	Result
Tool call confirmation	Bare `Enter` is sufficient
Model switching	Direct command works (no menu navigation needed)
End-of-response signal	Prompt character in last line of buffer
Simple question (Sonnet)	~15–18s
Simple question (Opus)	~3s
Multi-step task (Sonnet)	~210–220s
Multi-step task (Opus)	~150–160s
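The end-of-response rule in the table is simple enough to sketch as a polling loop. This is an illustration under assumptions, not the agent's real logic: the prompt character `>` is a placeholder for whatever the remote CLI actually shows, and the sketch checks the last non-empty line because a captured pane often ends in blank lines. The `capture` argument is any function that returns the current pane text.

```python
# Sketch: wait until the captured pane ends with the prompt character.
# The prompt character ">" and the timings are assumptions.
import time
from typing import Callable


def response_complete(buffer: str, prompt_char: str = ">") -> bool:
    """True when the last non-empty line of the pane contains the prompt character."""
    lines = [ln for ln in buffer.splitlines() if ln.strip()]
    return bool(lines) and prompt_char in lines[-1]


def wait_for_response(capture: Callable[[], str],
                      prompt_char: str = ">",
                      timeout_s: float = 300.0,
                      poll_s: float = 2.0,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll `capture` until the response looks finished, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while True:
        buffer = capture()
        if response_complete(buffer, prompt_char):
            return buffer
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no prompt seen within {timeout_s} seconds")
        sleep(poll_s)
```

The 300-second default leaves headroom over the slowest multi-step timing measured above. One known weakness of this heuristic is that it fires early if the model's own output happens to end a line containing the prompt character while it is still streaming.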

Evening: Autonomous Judgment

The last test was the most interesting. The agent needed to set up VPS backups. It analyzed the problem, then decided on its own not to escalate to the frontier agent — backup scripting was within its competence. It correctly identified the cases where it should escalate (disaster recovery planning, tool selection for complex scenarios) and the cases where it shouldn't (writing the actual script).

Dynamic escalation isn't just about capability. It's about judgment. The agent needs to know what it doesn't know.

Agent Session Stats

Metric	Value
Wall time	12h 24m
Agent active time	2h 08m
Tool calls	261 (97.3% success)
Input tokens processed	18.8M
Cache hit rate	95.5%
Context compactions	2

18.8 million tokens processed in a single session with only 2 compactions. This is possible because the model uses a hybrid architecture (33% attention / 67% SSM), which is brutally memory-efficient compared to a pure Transformer at the same scale: a 131K context window fits in 24 GB of VRAM.
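A rough back-of-the-envelope calculation shows why the split matters. Only attention layers keep a key/value cache that grows with context length; SSM layers carry a fixed-size state. The sketch below compares KV cache sizes at a 131,072-token context. The layer count, KV head count, head dimension, and 16-bit cache precision are illustrative assumptions, not the real model's configuration, so only the ratio is meaningful.

```python
# Sketch: KV cache size for a pure Transformer vs a hybrid with 1/3 attention layers.
# All model dimensions below are illustrative assumptions.

def kv_cache_bytes(attention_layers: int, kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for every token in the context."""
    # The leading 2 accounts for storing both a key and a value per head.
    return 2 * attention_layers * kv_heads * head_dim * context_tokens * bytes_per_value


GIB = 1024 ** 3
CONTEXT = 131_072          # the 131K window
TOTAL_LAYERS = 48          # hypothetical depth
ATTENTION_LAYERS = 16      # about one third of the layers use attention

pure = kv_cache_bytes(TOTAL_LAYERS, kv_heads=4, head_dim=128, context_tokens=CONTEXT)
hybrid = kv_cache_bytes(ATTENTION_LAYERS, kv_heads=4, head_dim=128, context_tokens=CONTEXT)

print(f"pure Transformer: {pure / GIB:.0f} GiB, hybrid: {hybrid / GIB:.0f} GiB")
```

With these made-up dimensions the cache drops from 12 GiB to 4 GiB. On a 24 GB card that also has to hold the weights, that gap decides whether the full context fits at all.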

The Final Architecture

┌─────────────────┐        ┌──────────────────────┐        ┌────────────────────────┐
│ Phone (mobile)  │        │ VPS (France)         │        │ Home LAN               │
│ Termux          │──5G──>│ WireGuard hub        │<──WG──│ Gateway / DNS          │
│ WireGuard       │        │ Claude Code          │        │         │              │
│                 │        │ (Sonnet / Opus)      │        │         ▼              │
│ Two aliases:    │        │                      │        │ ┌────────────────────┐ │
│  local agent    │        │                      │        │ │ Workstation        │ │
│  frontier agent │        │                      │        │ │ CachyOS            │ │
└─────────────────┘        └──────────────────────┘        │ │ Qwen Code 27B      │ │
                                                           │ │ llama-server       │ │
                                                           │ │ RTX 3090 24GB      │ │
                                                           │ │ ~30 tools          │ │
                                                           │ └────────────────────┘ │
                                                           └────────────────────────┘

Three access patterns from the phone: SSH directly to the local agent, SSH to the frontier agent on the VPS, or let the local agent delegate to the frontier agent autonomously. All through the same WireGuard mesh.

Total Cost

Item	Cost
VPS (6 months prepaid)	29.40€
WireGuard	0€
Claude Code	0€ (existing Pro sub)
Qwen Code	0€ (Apache 2.0)
llama-server	0€ (compiled from source)
Qwen 27B model	0€ (Apache 2.0)
MAS "framework"	0€ (SSH + tmux)
Total	29.40€ + electricity

What I Learned

1. MAS over Unix primitives works

SSH (1995), tmux (2007), WireGuard (2018) — three primitives that replace LangGraph, CrewAI, AutoGen, and n8n combined. The terminal is the API. tmux is the SDK. I wrote a full piece on this.

2. The "Code" naming is reductive

Claude Code, Qwen Code, Gemini CLI — these are full autonomous agents, not programming assistants. Today, these tools did sysadmin, networking, documentation, and architectural analysis. Zero lines of application code were written.

3. Dense 20B+ is the floor for inter-agent piloting

The frontier delegation pattern requires a minimum of cognitive depth — meta-cognitive judgment, situational parsing, multi-step planning. A 7B can't do it reliably. A 27B dense model with the right stack (memory, skills, epistemic system prompt) handles it naturally.

4. The hybrid SSM + Transformer architecture changes everything

18.8M tokens in a single session, 2 compactions, 131K context in 24 GB VRAM — impossible with a pure Transformer of the same size. The 33% attention / 67% SSM split is brutally efficient on memory. This is what makes a 27B model viable as an all-day agent on consumer hardware.

5. Sovereignty isn't all-or-nothing

Local inference, local memory, filtered DNS, private VPN — critical data stays under control. Gmail and YouTube stay with Google. Pragmatism beats purism. The goal isn't to eliminate all external dependencies. It's to control the ones that matter.

April 12, 2026 — Zwevegem to France to the world. One Sunday, two St. Feuillien Grand Cru, a non-coder, and tools that are 30 years old.