Multi-agent orchestration with Multica, OpenClaw, and Hermes
Run multiple AI runtimes — OpenClaw, Hermes, and Claude Code — on one Ollama VM with serialised local inference via agents.defaults.maxConcurrent: 1, an issue-board orchestrator (Multica), and a workflow for inviting collaborators into the same workspace.
- Step 1
Prerequisites
This guide assumes you've already finished Self-host OpenClaw with HTTPS, Brave search, and GitHub access. That guide covers the underlying setup this one builds on. If you're picking up mid-stack, scan its 'Overview' step to confirm your environment matches.
- Step 2
Multica — AI Orchestration Platform
Multica is an open-source (MIT) AI orchestration platform that treats AI agents as teammates with a shared issue board. It supports OpenClaw, Hermes, Claude Code, Codex, and other runtimes. Agents from multiple machines can collaborate on tasks assigned via the board.
- Step 3
Architecture
Three Docker containers run on the Ollama VM (192.168.20.110):
- multica-frontend-1: web UI (port 3000)
- multica-backend-1: API and WebSocket server (port 8080)
- multica-postgres-1: PostgreSQL 17 with pgvector
nginx proxies HTTPS on port 3443, reusing the OpenClaw self-signed certificate:
- /api/daemon/ws and /ws → backend with WebSocket upgrade headers
- /api and /auth → backend (port 8080)
- / → frontend (port 3000)
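The routing rules above amount to prefix matching. A minimal Python sketch of that idea, assuming nginx picks the longest matching plain-prefix location; the `ROUTES` table and `pick_upstream` helper are hypothetical names for illustration and are not part of nginx or Multica:

```python
# Hypothetical model of the proxy routes listed above (not real nginx config).
ROUTES = {
    "/api/daemon/ws": "backend:8080 (websocket)",
    "/ws": "backend:8080 (websocket)",
    "/api": "backend:8080",
    "/auth": "backend:8080",
    "/": "frontend:3000",
}


def pick_upstream(path):
    """Return the upstream whose prefix is the longest match for the path."""
    matches = [prefix for prefix in ROUTES if path.startswith(prefix)]
    return ROUTES[max(matches, key=len)]
```

Under these assumptions, `pick_upstream("/api/daemon/ws")` resolves to the WebSocket-enabled backend route rather than the plain `/api` route, which is why the daemon's connection gets its upgrade headers.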
- Step 4
Installation
- Config file: /home/ollama/.multica/server/.env
- CLI binary: /home/ollama/.local/bin/multica (keep it updated by running multica update periodically; v0.3.8+ fixes a stale capacity-counter bug that prevents task dispatch)
- Systemd service: /etc/systemd/system/multica.service (starts on boot)
# Deploy via Docker Compose
docker compose -f docker-compose.selfhost.yml up -d
- Step 5
Environment Config (.env key values)
MULTICA_SERVER_URL=wss://192.168.20.110:3443/ws
MULTICA_APP_URL=https://192.168.20.110:3443
ALLOWED_ORIGINS=https://192.168.20.110:3443
FRONTEND_ORIGIN=https://192.168.20.110:3443
ALLOW_SIGNUP=true
- Step 6
Login (No Email Provider)
No email provider is configured — login codes print directly to the backend container logs:
docker logs multica-backend-1 2>&1 | grep '\[DEV\]' | tail -5
- Step 7
Unifi Firewall Rule
- Source zone: VPN (10.99.99.0/24)
- Destination: 192.168.20.110, port 3443
- Action: Allow + Auto Return Traffic
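To make the rule's scope concrete, here is a small Python sketch that models only this one allow rule with the standard `ipaddress` module. The `rule_allows` helper is a hypothetical name; the firewall itself does the real enforcement, and other rules on the gateway are not modelled here:

```python
import ipaddress

# Values taken from the rule above.
VPN_ZONE = ipaddress.ip_network("10.99.99.0/24")
MULTICA_HOST = ipaddress.ip_address("192.168.20.110")
MULTICA_PORT = 3443


def rule_allows(src, dst, port):
    """True if a connection matches the VPN-to-Multica allow rule."""
    return (
        ipaddress.ip_address(src) in VPN_ZONE
        and ipaddress.ip_address(dst) == MULTICA_HOST
        and port == MULTICA_PORT
    )
```

A VPN peer such as 10.99.99.4 matches when it connects to 192.168.20.110 on port 3443; the same peer connecting to port 22, or to another LAN host, does not match this rule.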
- Step 8
VM Multica Daemon (OpenClaw Runtime)
The Multica daemon runs on the Ollama VM as a user systemd service and registers all three runtimes (OpenClaw, Hermes, and Claude Code) automatically on startup:
- Service file: ~/.config/systemd/user/multica-daemon.service
- Authenticated via a personal access token.
- OPENCLAW_GATEWAY_TOKEN is set in the service environment.
- Uses a WebSocket connection for real-time task dispatch (not polling).
~/.local/bin/multica daemon status
~/.local/bin/multica daemon logs
~/.local/bin/multica daemon restart
- Step 9
Runtime Visibility
By default, runtimes are private (owner-only). To make a runtime visible workspace-wide, set its visibility in the database:
UPDATE agent_runtime SET visibility = 'public' WHERE owner_id = '<user_id>';
Alternatively, use the Multica UI once that feature is exposed. Each user should also create Agents from their runtimes in Settings → Agents so that all workspace members can assign tasks to them.
- Step 10
Multica Agent Setup
Runtimes registered by the Multica daemon are not automatically available as assignable agents in the workspace. Agents must be created manually via the Multica UI and given workspace-level visibility.
- Step 11
Creating Agents via the UI
Go to https://192.168.20.110:3443 → Settings → Agents → New Agent for each runtime:
- Name: e.g. "OpenClaw"
- Runtime: select Openclaw (Ollama VM)
- Visibility: Workspace (so both you and your collaborator can assign tasks)
- Save
Repeat for your collaborator's runtimes once their daemon is online: Hermes, Claude, and Opencode (collaborator-mac.local).
- Step 12
If the Runtime Does Not Appear
If the UI does not show the runtime as an option, the agent can be created directly via the API:
# Get the runtime ID first
curl -s https://192.168.20.110:3443/api/runtimes \
  -H "Authorization: Bearer <token>" | jq .
# Create the agent
curl -X POST https://192.168.20.110:3443/api/agents \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"name":"OpenClaw","runtime_id":"<id>","visibility":"public"}'
- Step 13
Assigning Tasks
Once agents are created, the workflow is:
- Create an issue in the Multica board (e.g. Summer Fun project)
- Assign it to the OpenClaw agent
- OpenClaw will clone the repo, work on the task, and submit a PR
Note: Each workspace member needs to create Agents from their own runtimes. Visibility must be set to Workspace (not Private) for cross-member assignment to work.
- Step 14
Hermes on the Ollama VM
Hermes (by Nous Research) runs alongside OpenClaw on the Ollama VM as a second Multica agent runtime. It handles general reasoning and research tasks using a self-improving skills loop, complementing OpenClaw's coding and tool-use focus. Both runtimes are managed by the single Multica daemon already running on the VM.
- Step 15
Model Assignment
Each runtime is pointed at a different model on the same Ollama instance:
| Runtime | Model | Context | VRAM | Task focus |
| --- | --- | --- | --- | --- |
| Hermes | qwen3.5:27b | 64K | 17 GB | General reasoning, research, long-context analysis |
| OpenClaw | qwen3-coder:30b | 64K | 18 GB | Coding, tool use, file ops, Multica project management |
Note: Both models cannot be resident simultaneously (17 GB + 18 GB = 35 GB, which exceeds the 24 GB of VRAM). Ollama handles eviction automatically. OLLAMA_KEEP_ALIVE is set to 20m so the active runtime releases VRAM quickly when a different agent needs to load.
- Step 16
Installation
Run the official Hermes install script. It installs to ~/.hermes and creates a Python virtualenv that holds the agent and its dependencies.
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
- Step 17
Configuration
~/.hermes/config.yaml:
model:
  default: "qwen3.5:27b"
  provider: "custom"
~/.hermes/.env:
OLLAMA_BASE_URL=http://127.0.0.1:11434/v1
Note: Use provider: "custom" (not "ollama") for local Ollama in Hermes. The "ollama" provider value is not recognised and causes Hermes to fail to connect. base_url and api_key must also be set explicitly.
- Step 18
Multica Daemon
The existing multica daemon service (~/.config/systemd/user/multica-daemon.service) manages all three runtimes. The --max-concurrent-tasks 1 flag queues tasks serially to prevent VRAM contention between agents. The daemon auto-detects all three CLIs (openclaw, hermes, claude) and registers them as separate runtimes on startup, so no additional daemon is needed. The full ExecStart line and the other environment variables are listed in the key environment variables step later in this guide. The Hermes-specific line in the service file is:
Environment=MULTICA_HERMES_PATH=/home/ollama/.local/bin/hermes
- Step 19
Multica Agent
Create a "Hermes (Ollama VM)" agent in Multica:
- Settings → Agents → New Agent
- Runtime: Hermes (Ollama VM)
- Model: ollama/qwen3.5:27b
- Visibility: Workspace
- Step 20
Task Routing
- Hermes (Ollama VM) → general reasoning, research, long-context analysis, self-improving skill accumulation
- OpenClaw → coding, tool use, file ops, Multica project management
Note: Hermes builds a reusable skills library automatically from task experience, a capability OpenClaw does not have. Over time, assigning research and analysis tasks to Hermes compounds value as it learns project-specific patterns.
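Routing is done by hand when assigning an issue, but the split above can be written down as a lookup table. This is only an illustrative sketch: `ROUTING` and `suggest_agent` are hypothetical names, and Multica does not expose such a function as far as this guide shows:

```python
from typing import Optional

# Task categories copied from the routing list above.
ROUTING = {
    "Hermes (Ollama VM)": {
        "general reasoning",
        "research",
        "long-context analysis",
    },
    "OpenClaw": {
        "coding",
        "tool use",
        "file ops",
        "Multica project management",
    },
}


def suggest_agent(task_kind: str) -> Optional[str]:
    """Return the agent whose focus list contains the task kind, if any."""
    for agent, kinds in ROUTING.items():
        if task_kind in kinds:
            return agent
    return None
```

A task kind that appears in neither list returns `None`, which in practice means a human decides who gets the issue.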
- Step 21
VM Agent Stack
Three agents now run on the Ollama VM, all managed by the single Multica daemon (~/.config/systemd/user/multica-daemon.service):
| Agent | Provider | Model | Best for |
| --- | --- | --- | --- |
| OpenClaw | openclaw | qwen3-coder:30b | Coding, tool use, file ops, Multica project management |
| Hermes (Ollama VM) | hermes | qwen3.5:27b | General reasoning, research, long-context, self-improving skills loop |
| Claude (Ollama VM) | claude | claude-sonnet-4-5 | Tasks that exceed local model capability |
- Step 22
Claude Code CLI
Claude Code is installed on the VM and authenticated with the Anthropic API key via claude auth. It is auto-detected by the Multica daemon as provider=claude and appears as the "Claude (Ollama VM)" runtime.
# Install Claude Code CLI
npm install -g @anthropic-ai/claude-code
# Authenticate
claude auth
Note: Claude Code talks directly to the Anthropic API, with no Ollama involvement. It is used explicitly via the Claude (Ollama VM) Multica agent for tasks that exceed local model capability, not as a silent fallback.
- Step 23
Skip the Anthropic fallback inside OpenClaw
Keep OpenClaw on local models only. Claude is accessed explicitly via the dedicated Claude (Ollama VM) Multica agent, not as a silent fallback inside OpenClaw. Configure openclaw.json accordingly:
- Leave Anthropic models (e.g. anthropic/claude-sonnet-4-6) out of every fallback chain
- Disable the Anthropic plugin: plugins.entries.anthropic.enabled = false
- Don't define an anthropic:default auth profile
- Main agent fallback chain is local only: qwen3-coder:30b → qwen3.5:27b
- Set agents.defaults.maxConcurrent: 1 to prevent concurrent GPU access between agents
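These rules can be checked mechanically once openclaw.json is loaded. The sketch below is an assumption-heavy illustration: only the key paths plugins.entries.anthropic.enabled and agents.defaults.maxConcurrent come from this guide. Because the exact location of fallback chains and auth profiles is not shown here, the hypothetical `local_only_problems` helper simply scans the whole document for `anthropic/` model strings and an `anthropic:default` key:

```python
def local_only_problems(config):
    """Return findings that contradict the local-only rules listed above."""
    problems = []

    # plugins.entries.anthropic.enabled should be explicitly false.
    plugin = config.get("plugins", {}).get("entries", {}).get("anthropic", {})
    if plugin.get("enabled") is not False:
        problems.append("anthropic plugin is not explicitly disabled")

    # agents.defaults.maxConcurrent should be 1.
    if config.get("agents", {}).get("defaults", {}).get("maxConcurrent") != 1:
        problems.append("agents.defaults.maxConcurrent is not 1")

    def walk(node):
        # Recursively look for Anthropic references anywhere in the config.
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "anthropic:default":
                    problems.append("anthropic:default auth profile is defined")
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)
        elif isinstance(node, str) and node.startswith("anthropic/"):
            problems.append("Anthropic model referenced: " + node)

    walk(config)
    return problems
```

To use it, load /home/ollama/.openclaw/openclaw.json with `json.load` and pass the resulting dict in; an empty list means none of the checks fired.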
- Step 24
Multica Daemon — Key Environment Variables
Key lines from ~/.config/systemd/user/multica-daemon.service:
ExecStart=/home/ollama/.local/bin/multica daemon start --foreground --max-concurrent-tasks 1 --device-name 'Ollama VM' --runtime-name 'Ollama VM'
Environment=MULTICA_HERMES_PATH=/home/ollama/.local/bin/hermes
# Must match gateway.auth.token in openclaw.json exactly
Environment=OPENCLAW_GATEWAY_TOKEN=<token>
# Raised from the 2h default; local model tasks take 30-45+ min each
Environment=MULTICA_AGENT_TIMEOUT=6h
The daemon auto-detects all three CLIs (openclaw, hermes, claude) from PATH on startup and registers them as separate runtimes.
MULTICA_AGENT_TIMEOUT=6h: Per-task execution limit sent to the server when a task is claimed. The default is 2h, which is too short for local models (Hermes/OpenClaw tasks regularly take 30-45+ minutes each). The server also uses this value as the queue expiry window, so raising it gives later queued tasks a wider window to be claimed before they expire. 6h is a reasonable ceiling for the slowest local model tasks.
Warning: OPENCLAW_GATEWAY_TOKEN must exactly match gateway.auth.token in ~/.openclaw/openclaw.json. A mismatch causes the daemon to receive 401 on every dispatch attempt and silently report "at capacity", so tasks never start. Verify with:
grep OPENCLAW_GATEWAY_TOKEN ~/.config/systemd/user/multica-daemon.service | sed 's/.*=//'
python3 -c "import json; d=json.load(open('/home/ollama/.openclaw/openclaw.json')); print(d['gateway']['auth']['token'])"
The two lines of output must be identical. If they differ, fix the service file, then run:
systemctl --user daemon-reload && systemctl --user restart multica-daemon
- Step 25
Collaborative Setup — Friend Access
A collaborator can be granted access to the Multica workspace and the Ollama VM via WireGuard VPN. Their runtimes (Hermes, Claude Code, Opencode) appear alongside yours on the shared Multica board.
- Step 26
Adding a Collaborator
Invite their email via the Multica UI (workspace members). Because no email provider is configured, their invitation link and verification code both print to the backend container logs:
docker logs multica-backend-1 2>&1 | grep '\[DEV\]' | tail -5
The collaborator must accept the invite link first, then sign in using the verification code from the same log output.
- Step 27
WireGuard Peer Config (collaborator-mac)
- VPN IP: 10.99.99.4
- client_allowed_ips: 192.168.20.110/32, which restricts the peer to the Ollama VM only, not the full LAN (a /24 prefix would cover the whole 192.168.20.0/24 subnet)
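The prefix length on the allowed-IPs entry decides how much of the LAN the peer can route to. A quick check with Python's standard `ipaddress` module shows the difference between a /24 and a /32 on the VM's address; this is purely illustrative, since WireGuard does the actual enforcement:

```python
import ipaddress

# strict=False accepts a host address combined with a prefix length.
wide = ipaddress.ip_network("192.168.20.110/24", strict=False)
narrow = ipaddress.ip_network("192.168.20.110/32")

print(wide, wide.num_addresses)      # 192.168.20.0/24 256
print(narrow, narrow.num_addresses)  # 192.168.20.110/32 1

# The Proxmox host (192.168.20.6) falls inside the /24 but not the /32.
proxmox = ipaddress.ip_address("192.168.20.6")
print(proxmox in wide, proxmox in narrow)  # True False
```

Only the /32 form limits the peer to the single Ollama VM address.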
- Step 28
TLS Certificate Trust on Friend's Mac
The self-signed cert must be trusted on the collaborator’s Mac before the CLI or desktop app will connect. The cert is at /etc/ssl/certs/openclaw.crt on the Ollama VM. Copy it to the Mac and trust it:
# On the VM: print the cert content
cat /etc/ssl/certs/openclaw.crt
# On the Mac: save the output to /tmp/openclaw.crt, then trust it
sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain /tmp/openclaw.crt
- Step 29
Collaborator's Multica CLI Setup
brew install multica-ai/tap/multica
multica config set server_url https://192.168.20.110:3443
multica config set app_url https://192.168.20.110:3443
multica login
multica daemon start
- Step 30
Collaborator's Runtimes
The collaborator's runtimes are all set to visibility: public in the database so that they can be assigned tasks workspace-wide:
- Hermes (collaborator-mac.local)
- Claude (collaborator-mac.local)
- Opencode (collaborator-mac.local)
Note: Each collaborator needs to set their runtimes to public visibility and create Agents in Settings → Agents. The DB UPDATE command or UI toggle applies per user.
- Step 31
Desktop App Config (Critical)
The Multica desktop app has its own server config, independent of the CLI. Without it, the app connects to Multica cloud instead of the local instance regardless of the CLI’s server_url setting. On the collaborator’s machine, create ~/.multica/desktop.json:
{
  "schemaVersion": 1,
  "apiUrl": "https://192.168.20.110:3443"
}
Note: With desktop.json pointing at the local instance, the app cannot access Multica cloud workspaces. To switch back to cloud, remove or empty desktop.json and re-authenticate. Reference: GitHub issue #2103 / desktop app docs.
- Step 32
Diagnosing App vs CLI Mismatch
If a collaborator’s CLI shows the workspace but the desktop app does not, the app is authenticating against a different server. To confirm, request a login code in the app, then check the backend logs:
docker logs multica-backend-1 2>&1 | grep '\[DEV\]' | tail -5
If a fresh code appears, the app is reaching the local server. If not, desktop.json is missing or incorrect: fix it and restart the app.
- Step 33
IP and Access
- Proxmox WebUI: https://192.168.20.6:8006
- Proxmox SSH (laptop): ssh homelab (claude-code user, key: ~/.ssh/claude_homelab)
- Ollama VM SSH (laptop): ssh ollama@192.168.20.110
- Ollama VM SSH (from Proxmox): ssh ollama-vm (as claude-code user, key: ~/.ssh/ollama_vm)
- OpenClaw Dashboard: https://192.168.20.110 (WireGuard VPN required; accept self-signed cert on first visit)
- Multica Board: https://192.168.20.110:3443 (WireGuard VPN required; accept self-signed cert on first visit)
- WireGuard server: 192.168.10.18 (HAOS Pi 4)
- Step 34
Common Operations
Grab-bag of the most-used commands across the stack — VM management, storage, GPU, Ollama, security, OpenClaw, Multica, and logs.
# VM management
qm start 100
qm shutdown 100
qm status 100
qm listsnapshot 100
# Storage
pvesm status
df -h /mnt/backups /mnt/models
# GPU (on Proxmox host)
lspci -nnk -d 10de:2204 # verify vfio-pci binding
# GPU (inside VM)
nvidia-smi
watch -n1 nvidia-smi # live GPU utilisation
# Ollama (inside VM)
ollama list
ollama pull <model>
ollama run <model> '<prompt>'
systemctl status ollama
# Security
fail2ban-client status
fail2ban-client status sshd
fail2ban-client status proxmox
# OpenClaw (inside VM)
openclaw gateway status
openclaw channels status
openclaw logs --follow
openclaw gateway restart
# Multica daemon
multica daemon status
multica daemon logs
journalctl --user -u multica-daemon -n 50 --no-pager
systemctl --user restart multica-daemon
multica update && systemctl --user restart multica-daemon
# Multica issues
multica issue list --output json | python3 -c "import json,sys; d=json.load(sys.stdin); [print(i['identifier'],i['status'],i['title'][:50]) for i in d.get('issues',[])]"
multica issue rerun <issue-id> # re-queue a stuck todo issue
# Automated requeue timer (re-queues all todo issues every 90 min)
systemctl --user status multica-requeue.timer
systemctl --user list-timers multica-requeue
journalctl --user -u multica-requeue -n 20 --no-pager
# Manually trigger a requeue run
systemctl --user start multica-requeue.service
# Logs
journalctl -u fail2ban --no-pager | tail -20
journalctl -u ollama --no-pager | tail -20
journalctl --user -u openclaw-gateway.service -n 50 --no-pager
- Step 35
File Locations
- GPU blacklist: /etc/modprobe.d/blacklist-gpu.conf
- vfio-pci config: /etc/modprobe.d/vfio.conf
- Ollama service override: /etc/systemd/system/ollama.service.d/override.conf
- Ollama KV cache config: /etc/systemd/system/ollama.service.d/kv-cache.conf
- Ollama performance config: /etc/systemd/system/ollama.service.d/performance.conf
- nginx TLS config: /etc/nginx/sites-available/openclaw
- nginx TLS certificate: /etc/ssl/certs/openclaw.crt
- Fail2ban Proxmox jail: /etc/fail2ban/jail.d/proxmox.conf
- Fail2ban Proxmox filter: /etc/fail2ban/filter.d/proxmox.conf
- PVE apt repos: /etc/apt/sources.list.d/
- Claude Code context: /home/claude-code/CLAUDE.md
- SSH authorized keys (Proxmox): /home/claude-code/.ssh/authorized_keys
- SSH config (Proxmox → VM): /home/claude-code/.ssh/config
- SSH key (laptop → Proxmox): ~/.ssh/claude_homelab (on laptop)
- SSH key (Proxmox → VM): /home/claude-code/.ssh/ollama_vm
- Proxmox storage config: /etc/pve/storage.cfg
- VM config: /etc/pve/qemu-server/100.conf
- OpenClaw config: /home/ollama/.openclaw/openclaw.json
- Main agent orchestration rules: /home/ollama/.openclaw/workspace/main/AGENTS.md
- OpenClaw config backup: /home/ollama/.openclaw/openclaw.json.bak
- OpenClaw service file: /home/ollama/.config/systemd/user/openclaw-gateway.service
- OpenClaw logs: /tmp/openclaw/openclaw-<date>.log
- OpenClaw WhatsApp session: /home/ollama/.openclaw/whatsapp-session/
- OpenClaw gateway token: gateway.auth.token in openclaw.json (store in password manager; must match OPENCLAW_GATEWAY_TOKEN in daemon service exactly)
- Multica daemon service file: ~/.config/systemd/user/multica-daemon.service
- Multica CLI config: ~/.multica/config.json
- Multica server env: ~/.multica/server/.env
- Multica task workspaces: ~/multica_workspaces/
- Multica requeue script: ~/.local/bin/multica-requeue-todo
- Multica requeue service: ~/.config/systemd/user/multica-requeue.service
- Multica requeue timer: ~/.config/systemd/user/multica-requeue.timer (fires every 90 min; re-queues todo issues evicted by queue expiry)
- Step 36
Troubleshooting: Tasks Not Running
If tasks assigned to Hermes, OpenClaw, or Claude stay in todo and never start, check these four causes in order.
1. Wrong OPENCLAW_GATEWAY_TOKEN
The daemon service file (~/.config/systemd/user/multica-daemon.service) contains OPENCLAW_GATEWAY_TOKEN. It must exactly match gateway.auth.token in ~/.openclaw/openclaw.json. A single transposed character causes every dispatch to return 401, which the daemon silently counts as "at capacity", blocking all task execution.
Verify they match:
grep OPENCLAW_GATEWAY_TOKEN ~/.config/systemd/user/multica-daemon.service | sed 's/.*=//'
python3 -c "import json; d=json.load(open('/home/ollama/.openclaw/openclaw.json')); print(d['gateway']['auth']['token'])"
If they differ, correct the service file and restart:
systemctl --user daemon-reload && systemctl --user restart multica-daemon
2. Task queue expiry (queued_expired)
Tasks have a 2-hour queue expiry by default. If the daemon was offline, at capacity, or had a bad token when issues were assigned, the queue entries expire with failure_reason = 'queued_expired'. Expired entries are not retried automatically, so the issue stays in todo forever.
Re-queue a single issue:
multica issue rerun <issue-id>
Re-queue all todo issues in bulk:
multica issue list --output json --limit 200 | python3 -c "
import json, sys, subprocess
for i in json.load(sys.stdin).get('issues', []):
    if i['status'] == 'todo':
        subprocess.run(['multica', 'issue', 'rerun', i['id']])
        print('requeued', i['identifier'], i['title'][:50])
"
3. Outdated daemon (stale at-capacity counter)
Daemon versions before v0.3.8 have a bug where the internal running-task counter is not reset on reconnect, causing the daemon to perpetually report "at capacity running=1" even with no tasks in flight. Update and restart to clear it:
multica update && systemctl --user restart multica-daemon
Confirm the daemon is dispatching after restart by watching the backend logs. You should see task claimed and task started entries within 30 seconds of re-queuing issues:
docker logs multica-backend-1 --follow 2>&1 | grep -E '(task claimed|task started|no task to claim)'
4. Local model throughput vs. queue expiry
The server queues all todo tasks simultaneously when issues are dispatched, and each task entry has a fixed expiry window. With --max-concurrent-tasks 1 and slow local models (Hermes/OpenClaw tasks take 30-45+ minutes each), only a handful of tasks complete before later queued entries expire, even with the daemon working correctly.
Two mitigations, both required:
Raise MULTICA_AGENT_TIMEOUT: the daemon sends this to the server when claiming a task, and the server uses it as the expiry window. The default of 2h allows only ~3-4 local model tasks to complete before the queue window closes for the remaining entries. Set it to 6h in the daemon service file:
Environment=MULTICA_AGENT_TIMEOUT=6h
Then reload and restart the daemon:
systemctl --user daemon-reload && systemctl --user restart multica-daemon
Verify the new timeout is active: the startup log should show timeout=6h0m0s on the first task invocation.
Install the automated requeue timer: even with a longer window, a large batch of tasks will eventually exhaust it. The requeue timer re-queues any issue still in todo state every 90 minutes so evicted tasks drain automatically without manual intervention:
# Create the script
cat > ~/.local/bin/multica-requeue-todo << 'EOF'
#!/usr/bin/env bash
set -euo pipefail
log() { echo "[$(date -u '+%Y-%m-%dT%H:%M:%SZ')] $*"; }
log "Checking for stalled todo issues..."
ISSUES=$(multica issue list --output json --limit 200 2>&1)
TODO_IDS=$(echo "$ISSUES" | python3 -c "
import json, sys
data = json.load(sys.stdin)
for i in data.get('issues', []):
    if i['status'] == 'todo':
        print(i['id'], i['identifier'], i['title'][:60])
")
if [[ -z "$TODO_IDS" ]]; then log "No todo issues."; exit 0; fi
COUNT=0
while IFS= read -r line; do
  ID=$(echo "$line" | awk '{print $1}')
  LABEL=$(echo "$line" | cut -d' ' -f2-)
  log "Requeueing $LABEL ..."
  multica issue rerun "$ID" && log " requeued" || log " skipped (not assigned or already active)"
  COUNT=$((COUNT + 1))
done <<< "$TODO_IDS"
log "Done. Processed $COUNT todo issue(s)."
EOF
chmod +x ~/.local/bin/multica-requeue-todo
# Create the systemd service and timer
cat > ~/.config/systemd/user/multica-requeue.service << 'EOF'
[Unit]
Description=Multica: requeue stalled todo issues
After=multica-daemon.service
[Service]
Type=oneshot
ExecStart=/home/ollama/.local/bin/multica-requeue-todo
Environment=HOME=/home/ollama
Environment=PATH=/home/ollama/.local/bin:/usr/bin:/usr/local/bin:/bin
StandardOutput=journal
StandardError=journal
EOF
cat > ~/.config/systemd/user/multica-requeue.timer << 'EOF'
[Unit]
Description=Multica: requeue stalled todo issues every 90 minutes
After=multica-daemon.service
[Timer]
OnBootSec=2min
OnUnitActiveSec=90min
Unit=multica-requeue.service
[Install]
WantedBy=timers.target
EOF
systemctl --user daemon-reload
systemctl --user enable --now multica-requeue.timer
systemctl --user list-timers multica-requeue
The timer fires 2 minutes after each boot and then every 90 minutes. Issues that are not assigned to an agent (e.g. manual tasks) are skipped: multica issue rerun returns 400 for those, and the script logs "skipped" without failing.
- Step 37
You've reached the end of the series
This is the final guide in the self-hosted multi-agent AI stack series. Earlier guides in the series include Self-host OpenClaw with HTTPS, Brave search, and GitHub access.