Now the workspace is staged on the Hetzner cpx22 (server 134487234, public 91.98.47.73, tailnet cc-ci-orchestrator-1 @ 100.84.190.30):
- configuration.nix: enable cc-ci-loops.service (wantedBy multi-user.target) so the loops + watchdog auto-resume on boot; wire reboot-log.sh as ExecStartPre so reboots auto-log to REBOOTS.md (boot_id-gated).
- plan-orchestrator-hetzner-migration.md: full migration record.
- REBOOTS.md / AGENTS.md: point the orchestrator host at Hetzner; first auto-logged reboot line.
- launch-orchestrator.sh: default session id -> the Hetzner orchestrator session.
- flake.lock: pin inputs.
Verified: nixos-rebuild switch applied; systemctl is-enabled cc-ci-loops.service = enabled; ExecStartPre logged this boot to REBOOTS.md; loops healthy on phase 2.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
cc-ci-orchestrator — AGENTS.md
This folder is the orchestrator workspace for building the cc-ci Co-op Cloud recipe CI
server. It holds the plan, the launch/supervision tooling, and the two loop prompts. The actual CI
project (NixOS config, test runner, recipe tests) lives in a separate repo the loops create at
git.autonomic.zone/recipe-maintainers/cc-ci — do not confuse the two.
Three roles (don't conflate them)
- Orchestrator: this session/role. It supervises the two loops: checks in on them, reads their logs/STATUS, makes changes to the plan/prompts, restarts loops, and owns the VM-level fallback. It is separate from the loops and is the only role that should power-cycle/recreate the VM.
- Builder loop: builds the CI server (cc-ci-plan/prompts/builder.md).
- Adversary loop: independently disbelieves/verifies (cc-ci-plan/prompts/adversary.md).
The two loops coordinate only through the cc-ci git repo (see plan.md §6.1). The orchestrator
watches from outside.
On startup: announce yourself + report reboots
Every time you (the orchestrator) start or resume, send a PushNotification that you are online —
the operator wants to know the supervising session is back (especially after a reboot, which kills
this session). Include the current phase and the reboot count. Steps on startup:
- Read cc-ci-plan/REBOOTS.md (count the ## Reboots entries) and run cc-ci-plan/launch.sh status (current phase + whether the loops/watchdog are running).
- PushNotification (proactive), e.g.: "cc-ci orchestrator online: phase 2, loops+watchdog running; N reboots logged (last )."
- If a reboot happened while you were away (a new line in REBOOTS.md since you last looked, or the loops are down), check that cc-ci-loops.service brought the loops back; if not, relaunch with RESUME_PHASE=1 cc-ci-plan/launch.sh start.
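The first two startup steps could be sketched as a pair of shell helpers. This is only an illustration: the function names are hypothetical, the REBOOTS.md layout (one "- " bullet per reboot under a "## Reboots" heading) is an assumption, and the phase is passed in by hand because the output format of launch.sh status is not specified here.

```shell
#!/usr/bin/env bash
# Sketch: count logged reboots and build the "online" notification text.
# ASSUMPTION: each reboot is one "- " bullet line under the "## Reboots" heading.

count_reboots() {
  # $1: path to REBOOTS.md
  awk '
    /^## /              { in_section = ($0 ~ /^## Reboots *$/); next }
    in_section && /^- / { n++ }
    END                 { print n + 0 }
  ' "$1"
}

online_message() {
  # $1: current phase (read manually from `launch.sh status`), $2: reboot count
  printf 'cc-ci orchestrator online: phase %s; %s reboots logged' "$1" "$2"
}
```

For instance, `online_message 2 "$(count_reboots cc-ci-plan/REBOOTS.md)"` would yield the text to hand to the notification.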
Reboot resilience is handled by cc-ci-loops.service (system unit): on boot it logs the reboot
to REBOOTS.md (boot_id-gated) and runs launch.sh start with RESUME_PHASE=1, so the loops +
watchdog auto-resume the saved phase. The orchestrator session itself is NOT auto-started — the
operator reconnects to it (that's why the startup notification matters). The orchestrator now runs on
a Hetzner cpx22 cloud server (cc-ci-orchestrator-1, tailnet 100.84.190.30, public
168.119.126.100, flake host cc-ci-orchestrator-hetzner) — see
cc-ci-plan/plan-orchestrator-hetzner-migration.md. The earlier Pi→Incus-VM move is the historical
cc-ci-plan/plan-orchestrator-migration.md. Rebuild this host with
nixos-rebuild switch --flake .#cc-ci-orchestrator-hetzner from /srv/cc-ci-orch.
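To illustrate what "boot_id-gated" means, here is a minimal sketch of a logger that appends at most one line per boot. It is not the real reboot-log.sh: the function name and the entry format are assumptions. The only system fact it relies on is that Linux exposes a per-boot identifier at /proc/sys/kernel/random/boot_id.

```shell
#!/usr/bin/env bash
# Sketch of boot_id-gated logging: one line per boot, never twice.
# ASSUMPTION: the entry format below is illustrative, not the real one.

log_reboot_once() {
  # $1: log file (e.g. REBOOTS.md); $2: boot id (defaults to the kernel's)
  local log_file=$1
  local boot_id=${2:-$(cat /proc/sys/kernel/random/boot_id)}

  # Gate: if this boot id is already recorded, do nothing.
  if [ -f "$log_file" ] && grep -qF -- "$boot_id" "$log_file"; then
    return 0
  fi

  # First run in this boot: append a timestamped entry carrying the id.
  printf '%s %s boot_id=%s\n' '-' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$boot_id" >> "$log_file"
}
```

Because the gate keys on the boot id rather than on time, restarting the unit within the same boot would leave the log unchanged, while a real reboot adds exactly one line.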
Keep the orchestrator open, under remote-control
Run this session as a long-lived interactive session with --remote-control so the operator can
check in on the loops and steer/restart things from claude.ai/code (or the Claude mobile app)
without being at the terminal.
- Already in the session? Just run /remote-control; it attaches claude.ai/code to the live conversation (no exit, no resume needed).
- Starting fresh:
  claude --remote-control 'autonomous-orchestrator' --dangerously-skip-permissions
- Resuming this orchestrator later (history preserved):
  claude --resume autonomous-orchestrator --remote-control "autonomous-orchestrator" --dangerously-skip-permissions
  Note the two names are different: --resume <name|id> restores this conversation (the name set via -n/--name, shown in the /resume picker); the --remote-control [name] value is only the web display label and resumes nothing. The conversation persists on disk across exits; remote control itself only stays "connected" while the local process is alive (resume + re-enable to get it back after a full exit).
Use it to: tail loop logs (cc-ci-plan/launch.sh logs builder|adversary|watchdog), inspect
STATUS.md/REVIEW.md in the cc-ci repo, edit the plan or prompts, restart a stuck loop, or
power-cycle/recreate the cc-ci VM (see cc-ci-plan/kickoff.md → "Fallback: restart/recreate the
cc-ci VM"). The orchestrator is the human's steering wheel; the loops are the engine.
Launch & supervise the loops
- Source of truth for the loops: cc-ci-plan/plan.md (mission, Definition of Done, §1.5 credential map, §6 two-agent protocol, §7 loop discipline).
- Launch/supervision guide: cc-ci-plan/kickoff.md.
- cc-ci-plan/launch.sh start → both loops (interactive --remote-control in tmux) + a watchdog. tmux is installed; launch.sh defaults now point at /srv/cc-ci/....
Access & credentials (pointers only — values are gitignored)
- .testenv (NOT committed): Tailscale auth key + Gitea bot creds. Load with set -a; . .testenv; set +a (never echo the values).
- cc-ci: ssh cc-ci (root) directly; the orchestrator VM is a direct tailnet peer (100.90.116.4). No proxy. Key: ~/.ssh/cc-ci-root-ed25519. If unreachable, check tailscale status.
- Incus/VM fallback: mTLS certs at /srv/incus-terraform-nix-vm-creator/terraform-secrets/; b1 is on the same tailnet (reach via the same proxy). See kickoff "Fallback".
- Full credential map + how to use each: plan.md §1.5.
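The set -a; . .testenv; set +a idiom exports every variable the file assigns, so child processes see them without any value being printed. A minimal sketch of wrapping it in a helper that also checks the expected variables are present; the function name is hypothetical, and the variable names to check are left to the caller because the real names in .testenv are not listed here:

```shell
#!/usr/bin/env bash
# Sketch: load an env file with auto-export, then verify required names.
# Only variable NAMES are ever printed, never values.

load_env_file() {
  # $1: env file path; remaining args: variable names that must be set
  local env_file=$1
  shift
  [ -r "$env_file" ] || { echo "cannot read $env_file" >&2; return 1; }

  set -a          # auto-export everything assigned while sourcing
  . "$env_file"
  set +a

  local name
  for name in "$@"; do
    # printenv sees the variable only if it was exported above.
    if ! printenv "$name" > /dev/null; then
      echo "missing variable: $name" >&2
      return 1
    fi
  done
}
```

It would be called as `load_env_file ./.testenv NAME_ONE NAME_TWO`, with the placeholders replaced by whichever names plan.md §1.5 specifies.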
Hard rule
Never commit secret values. .testenv, *.tfstate, *.key/*.pem, and the loop runtime/clone
dirs are gitignored. Reference secret locations, never their contents (plan.md §9).