# cc-ci-orchestrator — AGENTS.md

This folder is the **orchestrator** workspace for building the **cc-ci** Co-op Cloud recipe CI
server. It holds the plan, the launch/supervision tooling, and the two loop prompts. The actual CI
project (NixOS config, test runner, recipe tests) lives in a **separate** repo the loops create at
`git.autonomic.zone/recipe-maintainers/cc-ci` — do not confuse the two.

## The agent map lives in `cc-ci-plan/orchestration.md`

That doc is the **root structure**: every agent (Builder, Adversary, Orchestrator, Assistant,
Upgrader) → its prompt + plan, plus the watchdog and the git coordination protocol. Read it first.

In short: the **Orchestrator** (*this* session/role) supervises and keeps everyone on track — it is
**separate** from the loops and is the only role that should power-cycle/recreate the host. The
**Builder** and **Adversary** loops coordinate **only** through the cc-ci git repo (`plan.md` §6.1);
the orchestrator watches from outside.

## On startup: read the journal, announce yourself, report reboots

**Every time you (the orchestrator) start or resume:**
1. **Read `cc-ci-plan/JOURNAL.md`** — the most recent `## Session` entry is where the previous
   session left off. This is the persistent handoff record; read it before anything else.
2. Read `cc-ci-plan/REBOOTS.md` (count entries) and run `cc-ci-plan/launch.sh status`
   (current phase + whether loops/watchdog are running).
3. **`PushNotification`** (proactive): *"cc-ci orchestrator online — phase X, loops+watchdog
   running; N reboots logged (last <date>)."*
4. If loops are down, relaunch: `RESUME_PHASE=1 cc-ci-plan/launch.sh start`.
5. **On handoff / end of session:** append a `## Session` block to `JOURNAL.md` summarising
   what happened, current state, and open items (see format in that file).
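
Steps 1 and 2 above can be partly scripted. The following is a minimal sketch, not the tooling shipped in this repo. It relies on `## Session` blocks being appended to `JOURNAL.md` (so the last one is the newest), and it assumes, without confirmation from `REBOOTS.md` itself, that each reboot entry is a single line starting with `- `. The function names `last_session` and `count_reboots` are hypothetical.

```bash
# Print the most recent "## Session" block: everything from the last such
# heading to the end of the file. If the file has no "## Session" heading,
# the whole file is printed.
last_session() {
  awk '/^## Session/ { buf = "" } { buf = buf $0 "\n" } END { printf "%s", buf }' "$1"
}

# Count reboot entries.
# ASSUMPTION: one entry per line, each starting with "- ".
count_reboots() {
  if [ -f "$1" ]; then
    # grep -c prints 0 but exits non-zero when nothing matches.
    grep -c '^- ' "$1" || true
  else
    echo 0
  fi
}
```

For example, `last_session cc-ci-plan/JOURNAL.md` shows the handoff notes and `count_reboots cc-ci-plan/REBOOTS.md` gives the `N` for the startup notification. Adjust the `grep` pattern if the real entry format differs.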

Reboot resilience is handled by **`cc-ci-loops.service`** (system unit): on boot it logs the reboot
to `REBOOTS.md` (boot_id-gated) and runs `launch.sh start` with `RESUME_PHASE=1`, so the loops +
watchdog auto-resume the saved phase. The orchestrator session itself is NOT auto-started — the
operator reconnects to it (that's why the startup notification matters). The orchestrator now runs on
a **Hetzner `cpx22`** cloud server (`cc-ci-orchestrator-1`, tailnet `100.84.190.30`, public
`168.119.126.100`, flake host `cc-ci-orchestrator-hetzner`) — see
`cc-ci-plan/plan-orchestrator-hetzner-migration.md`. The earlier Pi→Incus-VM move is the historical
`cc-ci-plan/plan-orchestrator-migration.md`. Rebuild this host with
`nixos-rebuild switch --flake .#cc-ci-orchestrator-hetzner` from `/srv/cc-ci-orch`.
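
"boot_id-gated" means the reboot is logged at most once per boot, even if the unit runs again. The sketch below illustrates the idea; it is not the actual contents of `cc-ci-loops.service`. The function name `log_reboot_once` and the entry format are assumptions made for illustration.

```bash
# Append one entry per boot. The kernel's boot_id changes on every boot, so a
# boot_id already present in the log means this boot was recorded earlier.
# ASSUMPTION: the "- <timestamp> boot_id=<id>" entry format is illustrative only.
log_reboot_once() {
  local boot_id="$1" log="$2"
  if [ -f "$log" ] && grep -qF -- "$boot_id" "$log"; then
    return 0
  fi
  printf -- '- %s boot_id=%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$boot_id" >> "$log"
}
```

On Linux the current boot ID can be read from `/proc/sys/kernel/random/boot_id`, so a call might look like `log_reboot_once "$(cat /proc/sys/kernel/random/boot_id)" cc-ci-plan/REBOOTS.md`.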

## Keep the orchestrator open, under remote-control

Run this session as a long-lived **interactive** session with `--remote-control` so the operator can
check in on the loops and steer/restart things from **claude.ai/code** (or the Claude mobile app)
without being at the terminal.

- **Already in the session?** Just run `/remote-control` — it attaches claude.ai/code to the live
  conversation (no exit, no resume needed).
- **Starting fresh:** `claude --remote-control 'autonomous-orchestrator' --dangerously-skip-permissions`
- **Resuming this orchestrator later (history preserved):**
  ```bash
  claude --resume autonomous-orchestrator --remote-control "autonomous-orchestrator" --dangerously-skip-permissions
  ```
  Note that the two name arguments do different jobs, even when they carry the same string:
  `--resume <name|id>` restores *this conversation* (the name set via `-n/--name`, shown in the
  `/resume` picker), while the `--remote-control [name]` value is only the web display label and
  resumes nothing. The conversation persists on disk across exits; remote control itself stays
  "connected" only while the local process is alive (after a full exit, resume and re-enable it).

Use it to: tail loop logs (`cc-ci-plan/launch.sh logs builder|adversary|watchdog`), inspect
`STATUS.md`/`REVIEW.md` in the cc-ci repo, edit the plan or prompts, restart a stuck loop, or
power-cycle/recreate the cc-ci VM (see `cc-ci-plan/kickoff.md` → "Fallback: restart/recreate the
cc-ci VM"). The orchestrator is the human's steering wheel; the loops are the engine.

## Launch & supervise the loops

- **Source of truth for the loops:** `cc-ci-plan/plan.md` (mission, Definition of Done, §1.5
  credential map, §6 two-agent protocol, §7 loop discipline).
- **Launch/supervision guide:** `cc-ci-plan/kickoff.md`.
- `cc-ci-plan/launch.sh start` → both loops (interactive `--remote-control` in tmux) + a watchdog.
  tmux is installed; `launch.sh` defaults now point at `/srv/cc-ci/...`.

## Access & credentials (pointers only — values are gitignored)

- `.testenv` (**NOT committed**): Tailscale auth key + Gitea bot creds. Load with
  `set -a; . .testenv; set +a` (never echo the values).
- **cc-ci:** `ssh cc-ci` (root) directly — the orchestrator VM is a direct tailnet peer (`100.90.116.4`).
  No proxy. Key: `~/.ssh/cc-ci-root-ed25519`. If unreachable, check `tailscale status`.
- **Incus/VM fallback:** mTLS certs at `/srv/incus-terraform-nix-vm-creator/terraform-secrets/`;
  b1 is on the same tailnet (reach it directly, like cc-ci). See kickoff "Fallback".
- **Full credential map + how to use each:** `plan.md` §1.5.
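
After loading `.testenv` as shown above, it can help to confirm that the expected variables are present without printing their values. The following is a minimal sketch; `require_vars` is a hypothetical helper, and the variable names in the usage comment are placeholders, since the real names live in `.testenv` and `plan.md` §1.5.

```bash
# Return non-zero if any named variable is unset or empty in the environment.
# Only variable NAMES are printed, never values. printenv sees exported
# variables only, which is what `set -a` produces when sourcing .testenv.
require_vars() {
  local name missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (placeholder variable names, not the real ones):
#   set -a; . .testenv; set +a
#   require_vars SOME_TAILSCALE_KEY SOME_GITEA_TOKEN
```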

## Hard rule

Never commit secret values. `.testenv`, `*.tfstate`, `*.key`/`*.pem`, and the loop runtime/clone
dirs are gitignored. Reference secret *locations*, never their contents (`plan.md` §9).
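
As a second line of defence behind `.gitignore`, the staged file list can be checked against the patterns named above before committing. This is a sketch under stated assumptions: `is_secret_path` and `check_staged` are hypothetical helpers, and the loop runtime/clone directories are not covered because their paths are not named in this file.

```bash
# Succeed if the path matches one of the secret patterns from the hard rule.
is_secret_path() {
  case "$1" in
    .testenv|*/.testenv|*.tfstate|*.key|*.pem) return 0 ;;
    *) return 1 ;;
  esac
}

# Read paths on stdin, report every match on stderr, and return non-zero
# if at least one path matched.
check_staged() {
  local path found=0
  while IFS= read -r path; do
    if is_secret_path "$path"; then
      echo "refusing to commit: $path" >&2
      found=1
    fi
  done
  return "$found"
}
```

A typical call is `git diff --cached --name-only | check_staged`, run before `git commit`.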

## Commit discipline

When the orchestrator, Builder, or Assistant makes intentional repository changes here, commit them
promptly and push them to `git.autonomic.zone` in append-only fashion (never force-push). Match the
existing commit author and message style in this repo. Do not bundle unrelated worktree changes you
did not make; stage only the intended files.
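
One way to keep unrelated worktree changes out of a commit is to name the intended paths explicitly. The following is a minimal sketch; `commit_only` is a hypothetical helper, and the remote name `origin` in the closing comment is an assumption.

```bash
# Stage exactly the given paths and commit only them. Other modified or
# staged files are left out of the commit. The author identity comes from
# the existing git configuration, so it matches earlier commits.
commit_only() {
  local message="$1"
  shift
  if [ "$#" -eq 0 ]; then
    echo "usage: commit_only MESSAGE PATH..." >&2
    return 2
  fi
  git add -- "$@" && git commit -m "$message" -- "$@"
}

# Afterwards, push without any force flag (ASSUMPTION: the remote is "origin"):
#   git push origin HEAD
```

Passing the paths to `git commit` as well as `git add` means anything else that happens to be staged stays out of this commit.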