From bdc78da9212c9523a49169672ffcf9e9191297d1 Mon Sep 17 00:00:00 2001
From: autonomic-bot <mfowler.email@protonmail.com>
Date: Tue, 26 May 2026 20:46:28 +0100
Subject: [PATCH] Initial commit: cc-ci autonomous orchestrator

Planning + launch + setup material for the cc-ci Co-op Cloud recipe CI server:
plan.md (single source of truth), kickoff/launch supervision, and the
Builder/Adversary loop prompts. Secrets (.testenv) and runtime dirs are gitignored.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .gitignore                      |  12 +
 cc-ci-plan/README.md            |  34 ++
 cc-ci-plan/brief.md             |  35 ++
 cc-ci-plan/kickoff.md           | 143 +++++++
 cc-ci-plan/launch.sh            | 203 ++++++++++
 cc-ci-plan/plan.md              | 635 ++++++++++++++++++++++++++++++++
 cc-ci-plan/prompts/adversary.md |  19 +
 cc-ci-plan/prompts/builder.md   |  25 ++
 references/recipe-maintainer    |   1 +
 9 files changed, 1107 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 cc-ci-plan/README.md
 create mode 100644 cc-ci-plan/brief.md
 create mode 100644 cc-ci-plan/kickoff.md
 create mode 100755 cc-ci-plan/launch.sh
 create mode 100644 cc-ci-plan/plan.md
 create mode 100644 cc-ci-plan/prompts/adversary.md
 create mode 100644 cc-ci-plan/prompts/builder.md
 create mode 120000 references/recipe-maintainer

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..36fcfc2
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,12 @@
+# Secrets — NEVER commit
+.testenv
+*.tfstate
+*.tfstate.*
+*.key
+*.pem
+
+# Loop runtime / working clones (created at launch by launch.sh)
+/cc-ci/
+/cc-ci-adv/
+/.cc-ci-watch/
+/.cc-ci-logs/
diff --git a/cc-ci-plan/README.md b/cc-ci-plan/README.md
new file mode 100644
index 0000000..d093444
--- /dev/null
+++ b/cc-ci-plan/README.md
@@ -0,0 +1,34 @@
+# cc-ci-plan
+
+Self-contained handoff package for building the **cc-ci** Co-op Cloud recipe CI server with two
+autonomous Claude loops (a Builder and an Adversary that reviews its work) running over days.
+
+## Start here
+
+1. Read **`plan.md`** — the full plan and single source of truth (mission, Definition of Done,
+   architecture, milestones, the two-agent coordination protocol, loop discipline).
+2. Read **`kickoff.md`** — how to launch and supervise the loops.
+3. Run **`./launch.sh start`** to bring up both loops + the watchdog.
+
+## Files
+
+| File | Purpose |
+|---|---|
+| `plan.md` | The plan. Agents treat it as their single source of truth. |
+| `brief.md` | The original one-page brief (context only; `plan.md` supersedes it). |
+| `kickoff.md` | Launch & supervision guide. |
+| `launch.sh` | Starts both loops + a watchdog; restarts dead loops; stops on `## DONE`. |
+| `prompts/builder.md` | Builder loop prompt (fed to `claude` by the script). |
+| `prompts/adversary.md` | Adversary loop prompt. |
+
+## Before launching
+
+- Set the org in `plan.md` (`git.autonomic.zone/recipe-maintainers/cc-ci`) and lock the six proof recipes (§8).
+- Ensure the launching shell has: SSH (root) to `cc-ci`, the Gitea bot creds, and `git.autonomic.zone` access.
+- Preconfigure test-app DNS + TLS (plan §4.0): point a wildcard `*.ci.commoninternet.net` record at a gateway that TLS-passthroughs to cc-ci, and **pre-issue the wildcard cert** (`*.ci.commoninternet.net` + `ci.commoninternet.net`, via Gandi DNS-01) into `/var/lib/ci-certs/live/` on cc-ci. The agent handles everything else on cc-ci (Traefik file provider → that cert, swarm, routing) and does **no ACME**; renewal (~90 days) is an out-of-band operator task, so the DNS token never goes to the agent.
+- `CC_CI_REPO` defaults to `https://git.autonomic.zone/recipe-maintainers/cc-ci.git` in `launch.sh`; `export` a different URL if you changed the org, so the watchdog can detect `## DONE`.
+
+## What "done" means
+
+The loops stop only when all of `plan.md` §2 (D1–D10) hold **and** the Adversary has independently
+re-verified each within the last 24h. The watchdog then tears the loops down automatically.
diff --git a/cc-ci-plan/brief.md b/cc-ci-plan/brief.md
new file mode 100644
index 0000000..ef0883b
--- /dev/null
+++ b/cc-ci-plan/brief.md
@@ -0,0 +1,35 @@
+we are working on making a CI server
+
+I want you to work in an autonomous loop over the next few days until the CI server is fully functional, polished and documented
+
+on any PR on git.autonomic.zone it should be invokable by writing !testme as a comment 
+
+this should invoke the set of CI tests to be run for the recipe code at that PR 
+
+the CI tests should be run via drone 
+
+the tests run for a recipe should be written in python. e2e testing via playwright should be used when necessary to confirm functionality
+
+there should be tests which test 
+- new install
+- upgrade 
+- backups (including restore)
+
+all the tests should be fully e2e, with a real deployed recipe 
+
+the CI runner should be deployed on a server called cc-ci which is running nixos
+
+cc-ci git repo should also live on git.autonomic.zone which contains all the nix configuration for the server, as well as the code for the CI test runner
+
+the CI test runner should have its own folder of tests, with one folder for each recipe, with each of those folders containing a set of tests as python files which get invoked for that recipe
+
+secrets should also be handled in a reasonable and repeatable way 
+
+additionally, if a recipe repo itself contains a tests folder in the recipe, the CI runner should also invoke those tests as part of the CI run for those tests 
+
+the results of the test run should be easily viewable, with trackable logs, and a final result, very similar in style to the way the yunohost CI runner looks and feels
+
+you will have ssh access to cc-ci server, as well as sudo access there
+
+you will also have access to create and modify repos on git.autonomic.zone
+
diff --git a/cc-ci-plan/kickoff.md b/cc-ci-plan/kickoff.md
new file mode 100644
index 0000000..0877e6f
--- /dev/null
+++ b/cc-ci-plan/kickoff.md
@@ -0,0 +1,143 @@
+# cc-ci — Kickoff & Launch
+
+Everything needed to start the autonomous cc-ci build loop. The substance lives in `plan.md`;
+this file explains how to launch and supervise the two agents.
+
+## Folder contents
+
+```
+cc-ci-plan/
+├── plan.md             # THE plan — single source of truth (read this in full)
+├── brief.md            # original one-page brief (context only; superseded by plan.md)
+├── kickoff.md          # this file — how to launch & supervise
+├── launch.sh           # starts both loops + watchdog, stops on ## DONE
+└── prompts/
+    ├── builder.md      # Builder loop prompt (fed to claude by launch.sh)
+    └── adversary.md    # Adversary loop prompt
+```
+
+> Note: `/srv/cc-ci/cc-ci-plan/` (this folder) is the **planning + launch material**. The actual
+> CI project — NixOS config, runner, tests — lives in a **separate git repo** the Builder creates
+> at `git.autonomic.zone/recipe-maintainers/cc-ci`, cloned to `/srv/cc-ci/cc-ci` (Builder) and
+> `/srv/cc-ci/cc-ci-adv` (Adversary). Don't confuse the two.
+
+## Model: two independent loops (plan §6 / §6.1)
+
+- **Builder**: builds the CI server; owns code + `STATUS.md`/`JOURNAL.md`/`DECISIONS.md` + the
+  `## Build backlog` section of `BACKLOG.md`.
+- **Adversary**: independently disbelieves and re-verifies; owns `REVIEW.md` + the
+  `## Adversary findings` section of `BACKLOG.md`. Holds veto over `## DONE`.
+
+They run as two separate processes and coordinate **only** through the git repo. Single-writer file
+ownership keeps concurrent pushes merge-clean.
+
+## Two layers of "looping" — and why you want both
+
+| Concern | Mechanism | Who provides it |
+|---|---|---|
+| **Iteration** — keep doing one unit of work, then wake again | `/loop` self-paced (ScheduleWakeup), per plan §7 pacing | each agent, in-session |
+| **Resilience** — restart a loop whose process/sandbox died; stop all on `## DONE` | `launch.sh` watchdog (tmux + git poll) | this script |
+
+`/loop` alone is bound to its process: if the sandbox restarts, that loop is gone until something
+relaunches it. The watchdog is that something. Use both.
+
+## Launch
+
+```bash
+cd /srv/cc-ci/cc-ci-plan
+
+# Optional: launch.sh already defaults CC_CI_REPO to this URL; export it only to point the watchdog elsewhere:
+export CC_CI_REPO=https://git.autonomic.zone/recipe-maintainers/cc-ci.git
+
+./launch.sh start        # starts cc-ci-builder + cc-ci-adv + cc-ci-watchdog (tmux sessions)
+./launch.sh status       # session + DONE state
+./launch.sh logs builder # tail a loop;  also: logs adversary | logs watchdog
+tmux attach -t cc-ci-builder   # watch a loop live locally (detach: Ctrl-b d)
+./launch.sh stop         # stop everything
+```
+
+`launch.sh` is idempotent: re-running `start` won't duplicate a live session. Each agent runs as an
+**interactive** `claude` in tmux (kickoff prompt passed as a positional arg, *not* piped; piping
+forces print mode and breaks `/loop`). With `REMOTE_CONTROL=1` (default) each agent is launched with
+`--remote-control`, so you can **watch and steer both loops from [claude.ai/code](https://claude.ai/code)**
+(or the Claude mobile app), not just via `tmux attach`. The box must be logged into the claude.ai
+account (`claude auth status`); set `REMOTE_CONTROL=0` to skip the remote surface. The watchdog
+checks every 300s by default and restarts any dead session. A network outage longer than ~10 min
+exits the `claude` process, after which the watchdog brings it back as a fresh remote-control
+session. When `STATUS.md` shows `## DONE`, the watchdog kills the loops and exits.
+
+Prerequisites the sessions inherit from your shell: SSH (root) to `cc-ci` via the Tailscale proxy
+(§1.5), Gitea bot creds, and `git.autonomic.zone` access. Plus **preconfigured** operator inputs the
+loop depends on (plan §4.0/§4.4): the wildcard `*.ci.commoninternet.net` DNS record pointing at a
+gateway that TLS-passthroughs to cc-ci, and the **pre-issued wildcard cert** at
+`/var/lib/ci-certs/live/` on cc-ci. The operator owns the DNS record + gateway + cert
+issuance/renewal; the agent builds Traefik (file provider → that cert) + routing on cc-ci and does
+**no ACME**. If any prerequisite is absent, the Builder parks at `STATUS.md ## Blocked` (plan §1/§9)
+rather than improvise.
+
+> Host deps: `launch.sh` needs **tmux** (and `claude`). tmux 3.5a is already installed on this
+> sandbox host; on a fresh Debian/Ubuntu host, run `sudo apt-get install -y tmux`. The script's
+> `*_DIR` defaults point at `/srv/cc-ci/...` (Builder clone `/srv/cc-ci/cc-ci`, Adversary
+> `/srv/cc-ci/cc-ci-adv`); override the `*_DIR` env vars only if your layout differs.
+
+## Optional: a cloud-side `/schedule` watchdog
+
+`launch.sh`'s watchdog is itself a local process — if the *whole host* goes down it stops too. For
+belt-and-suspenders durability, also create a `/schedule` routine (a remote agent that fires on a
+cron and re-orients from the repo). From inside a Claude session:
+
+```
+/schedule every 2 hours: read /srv/cc-ci/cc-ci-plan/plan.md §7 and the cc-ci repo STATUS.md; if the
+Builder/Adversary loops are not making progress (or launch.sh is not running), restart them via
+/srv/cc-ci/cc-ci-plan/launch.sh start; stop when STATUS.md says ## DONE.
+```
+
+This complements the local watchdog: scheduled runs are fresh, independent agents, so they survive
+process/context death that would take the in-session `/loop` and the local watchdog with it.
+
+## Fallback: restart/recreate the cc-ci VM (orchestrator only)
+
+**This is primarily an escape hatch for *you*, the supervising orchestrator.** The loops normally
+reconfigure cc-ci only from inside (via Nix); power-cycling or recreating the VM shouldn't be their
+default move — but it's not forbidden if one gets genuinely stuck. Reach for this when cc-ci itself
+is wedged at a level that can't be fixed from inside (won't boot, disk full, swarm/Docker corrupted,
+unreachable even after a proxy restart): use the Incus skill to power-cycle or rebuild the VM, then
+re-bootstrap.
+
+`cc-nix-test` (the cc-ci server, tailnet `100.90.116.4`) is a **NixOS Incus VM** on host **b1**
+(`100.117.251.31:8443`, Incus project `terraform-ci`). Skill + Terraform live at
+`/srv/incus-terraform-nix-vm-creator/` (`skills/incus-terraform/SKILL.md`); read that for full usage.
+
+- **Access:** b1 is on the *same* cc-ci tailnet, so reach the Incus API through the existing
+  `cc-ci-tailscaled` SOCKS proxy (`127.0.0.1:1055`) with the mTLS certs in that repo's
+  `terraform-secrets/` — no second tailscaled needed. Quick check:
+  ```bash
+  CRT=/srv/incus-terraform-nix-vm-creator/terraform-secrets/terraform.crt
+  KEY=/srv/incus-terraform-nix-vm-creator/terraform-secrets/terraform.key
+  curl --proxy socks5h://localhost:1055 --cert "$CRT" --key "$KEY" -k -s \
+    "https://100.117.251.31:8443/1.0/instances/cc-nix-test/state?project=terraform-ci"
+  ```
+- **Soft restart (keeps the disk; preferred):** `PUT .../1.0/instances/cc-nix-test/state?project=terraform-ci`
+  with `{"action":"restart"}` (or `"stop"` / `"start"`).
+- **Full recreate (last resort):** the Terraform module in `/srv/incus-terraform-nix-vm-creator/projects/`
+  (`terraform apply` with `-var incus_remote_address=100.117.251.31 -var incus_project=terraform-ci
+  -var ts_auth_key=$TSKEY`). ⚠ **Recreating wipes the VM disk** — you must then re-apply the cc-ci
+  preconditions: the pre-issued TLS cert into `/var/lib/ci-certs/live/` and the
+  `cc-ci-root-ed25519` pubkey into root's `authorized_keys` (see the access notes), and the loops
+  re-run §1 Bootstrap. Prefer a soft restart; only recreate if the VM is truly unrecoverable.
+
+(Project cap: keep total RAM across `terraform-ci` instances under 10 GB — check before recreating.)
+
+## Manual launch (no script)
+
+If you'd rather not use `launch.sh`, start each agent interactively yourself (same result, no
+supervision/restart), passing the prompt as a positional argument so the session stays interactive
+and remote-controllable:
+
+```bash
+claude --remote-control 'cc-ci-builder' --dangerously-skip-permissions "$(cat prompts/builder.md)"
+claude --remote-control 'cc-ci-adv'     --dangerously-skip-permissions "$(cat prompts/adversary.md)"
+```
+
+Do **not** pipe the prompt (`cat prompts/builder.md | claude …`) — that forces print/headless mode,
+which breaks `/loop` and remote control.
diff --git a/cc-ci-plan/launch.sh b/cc-ci-plan/launch.sh
new file mode 100755
index 0000000..ab9f17f
--- /dev/null
+++ b/cc-ci-plan/launch.sh
@@ -0,0 +1,203 @@
+#!/usr/bin/env bash
+#
+# launch.sh — start and supervise the two cc-ci autonomous loops + a watchdog.
+#
+# Model (see plan.md §6 / §6.1): two INDEPENDENT Claude Code sessions —
+#   • Builder   (tmux session: cc-ci-builder)   working clone /srv/cc-ci/cc-ci
+#   • Adversary (tmux session: cc-ci-adv)        working clone /srv/cc-ci/cc-ci-adv
+# coordinating only through the git repo on git.autonomic.zone.
+#
+# Each agent self-paces with a `/loop` (ScheduleWakeup) — that handles ITERATION.
+# This script's watchdog handles RESILIENCE: it restarts a session that has died
+# and stops everything once STATUS.md reports "## DONE".
+#
+# Usage:
+#   ./launch.sh start       # start both loops + watchdog (idempotent)
+#   ./launch.sh watchdog    # run only the supervision loop in the foreground
+#   ./launch.sh status      # show session + DONE state
+#   ./launch.sh logs builder|adversary|watchdog   # tail a session/log
+#   ./launch.sh stop        # stop both loops + watchdog
+#
+# Configure via env vars (defaults below). CC_CI_REPO defaults to the recipe-maintainers
+# repo; override it if the Builder creates the repo elsewhere, so the watchdog can detect DONE.
+
+set -euo pipefail
+
+# ----- config -------------------------------------------------------------
+PLAN_DIR="${PLAN_DIR:-/srv/cc-ci/cc-ci-plan}"
+CLAUDE_BIN="${CLAUDE_BIN:-claude}"
+# Flags for unattended operation in a sandbox. Override if your setup differs.
+CLAUDE_FLAGS="${CLAUDE_FLAGS:---dangerously-skip-permissions}"
+# Each agent always runs as an INTERACTIVE claude session in tmux; /loop + ScheduleWakeup are
+# interactive-only, so a piped/print-mode session cannot self-pace. REMOTE_CONTROL=1 additionally
+# passes --remote-control, making the session viewable/steerable at claude.ai/code (and the Claude
+# mobile app); that requires the box to be logged into the claude.ai account (check with
+# `claude auth status`). Set REMOTE_CONTROL=0 for a plain interactive session with no remote
+# surface. Each agent gets its own RC session named after its tmux session.
+REMOTE_CONTROL="${REMOTE_CONTROL:-1}"
+
+BUILDER_DIR="${BUILDER_DIR:-/srv/cc-ci/cc-ci}"        # Builder's repo clone (it creates this)
+ADV_DIR="${ADV_DIR:-/srv/cc-ci/cc-ci-adv}"            # Adversary's repo clone
+WATCH_DIR="${WATCH_DIR:-/srv/cc-ci/.cc-ci-watch}"     # tiny clone the watchdog reads STATUS.md from
+LOG_DIR="${LOG_DIR:-/srv/cc-ci/.cc-ci-logs}"
+
+CC_CI_REPO="${CC_CI_REPO:-https://git.autonomic.zone/recipe-maintainers/cc-ci.git}"  # CI project repo (DONE detection); harmless until the Builder creates it
+CC_CI_BRANCH="${CC_CI_BRANCH:-main}"
+
+WATCH_INTERVAL="${WATCH_INTERVAL:-300}"   # seconds between watchdog checks
+
+BUILDER_SESSION="cc-ci-builder"
+ADV_SESSION="cc-ci-adv"
+WATCHDOG_SESSION="cc-ci-watchdog"
+# --------------------------------------------------------------------------
+
+log() { printf '[launch %(%H:%M:%S)T] %s\n' -1 "$*"; }
+die() { log "ERROR: $*"; exit 1; }
+
+need() { command -v "$1" >/dev/null 2>&1 || die "missing dependency: $1"; }
+
+preflight() {
+  need tmux
+  command -v "$CLAUDE_BIN" >/dev/null 2>&1 || die "claude CLI not found (set CLAUDE_BIN)"
+  [[ -f "$PLAN_DIR/prompts/builder.md"   ]] || die "missing $PLAN_DIR/prompts/builder.md"
+  [[ -f "$PLAN_DIR/prompts/adversary.md" ]] || die "missing $PLAN_DIR/prompts/adversary.md"
+  mkdir -p "$LOG_DIR"
+}
+
+session_alive() { tmux has-session -t "$1" 2>/dev/null; }
+
+# Start one agent loop in its own tmux session, cd'd into its working dir, with
+# the kickoff prompt passed to claude as a positional argument (see below for why
+# not stdin).
+start_agent() {
+  local session="$1" workdir="$2" prompt_file="$3"
+  if session_alive "$session"; then
+    log "$session already running — leaving it"
+    return 0
+  fi
+  mkdir -p "$workdir"
+  log "starting $session (cwd=$workdir, remote_control=$REMOTE_CONTROL)"
+  # tmux gives claude a real PTY, so we run claude INTERACTIVELY (required for /loop +
+  # ScheduleWakeup). The kickoff prompt is passed as a POSITIONAL argument via an inner
+  # `$(cat ...)` — NOT piped on stdin, because piping forces print/headless mode which
+  # breaks both interactivity and --remote-control. The `\$(...)` defers to the inner shell
+  # so the whole multi-line prompt arrives as a single argument.
+  local rc=""
+  [[ "$REMOTE_CONTROL" == "1" ]] && rc="--remote-control '$session'"
+  tmux new-session -d -s "$session" -c "$workdir" \
+    "$CLAUDE_BIN $rc $CLAUDE_FLAGS \"\$(cat '$prompt_file')\""
+  # Log the pane WITHOUT redirecting claude's stdout: a `>>log` redirect makes stdout a non-tty
+  # and drops claude out of interactive/remote-control mode; pipe-pane mirrors the live pane
+  # instead. `|| log` stops `set -e` from killing the watchdog if the session already exited.
+  tmux pipe-pane -o -t "$session" "cat >> '$LOG_DIR/$session.log'" || log "warning: pipe-pane failed for $session (session already gone?)"
+}
+
+start_loops() {
+  start_agent "$BUILDER_SESSION" "$BUILDER_DIR" "$PLAN_DIR/prompts/builder.md"
+  start_agent "$ADV_SESSION"     "$ADV_DIR"     "$PLAN_DIR/prompts/adversary.md"
+}
+
+# Returns 0 (true) if the repo's STATUS.md contains a "## DONE" heading.
+is_done() {
+  [[ -n "$CC_CI_REPO" ]] || return 1
+  if [[ ! -d "$WATCH_DIR/.git" ]]; then
+    git clone --depth 1 --branch "$CC_CI_BRANCH" "$CC_CI_REPO" "$WATCH_DIR" >/dev/null 2>&1 || return 1
+  fi
+  git -C "$WATCH_DIR" fetch --depth 1 origin "$CC_CI_BRANCH" >/dev/null 2>&1 || return 1
+  git -C "$WATCH_DIR" reset --hard "origin/$CC_CI_BRANCH" >/dev/null 2>&1 || return 1
+  grep -qE '^##[[:space:]]+DONE' "$WATCH_DIR/STATUS.md" 2>/dev/null
+}
+
+watchdog_loop() {
+  log "watchdog up (interval=${WATCH_INTERVAL}s, repo=${CC_CI_REPO:-<unset: DONE-detection disabled>})"
+  while true; do
+    # 1) DONE? then wind everything down.
+    if is_done; then
+      log "STATUS.md reports ## DONE — stopping loops."
+      stop_loops
+      log "watchdog exiting (project complete)."
+      exit 0
+    fi
+    # 2) restart any dead loop (resilience the in-session /loop can't provide).
+    if ! session_alive "$BUILDER_SESSION"; then
+      log "builder session gone — restarting"
+      start_agent "$BUILDER_SESSION" "$BUILDER_DIR" "$PLAN_DIR/prompts/builder.md"
+    fi
+    if ! session_alive "$ADV_SESSION"; then
+      log "adversary session gone — restarting"
+      start_agent "$ADV_SESSION" "$ADV_DIR" "$PLAN_DIR/prompts/adversary.md"
+    fi
+    sleep "$WATCH_INTERVAL"
+  done
+}
+
+start_watchdog() {
+  if session_alive "$WATCHDOG_SESSION"; then
+    log "watchdog already running"
+    return 0
+  fi
+  log "starting watchdog"
+  tmux new-session -d -s "$WATCHDOG_SESSION" -c "$PLAN_DIR" \
+    "exec >>'$LOG_DIR/watchdog.log' 2>&1; '$(readlink -f "$0")' watchdog"
+}
+
+stop_loops() {
+  for s in "$BUILDER_SESSION" "$ADV_SESSION"; do
+    if session_alive "$s"; then log "killing $s"; tmux kill-session -t "$s" || true; fi
+  done
+}
+
+cmd_status() {
+  for s in "$BUILDER_SESSION" "$ADV_SESSION" "$WATCHDOG_SESSION"; do
+    if session_alive "$s"; then echo "  $s: RUNNING"; else echo "  $s: stopped"; fi
+  done
+  if [[ -n "$CC_CI_REPO" ]]; then
+    if is_done; then echo "  project: ## DONE"; else echo "  project: in progress"; fi
+  else
+    echo "  project: (CC_CI_REPO unset — DONE-detection disabled)"
+  fi
+}
+
+case "${1:-}" in
+  start)
+    preflight
+    start_loops
+    start_watchdog
+    log "started. inspect with: ./launch.sh status   |   attach: tmux attach -t $BUILDER_SESSION"
+    ;;
+  watchdog)  preflight; watchdog_loop ;;
+  status)    cmd_status ;;
+  logs)
+    case "${2:-}" in
+      builder)   tail -f "$LOG_DIR/$BUILDER_SESSION.log" ;;
+      adversary) tail -f "$LOG_DIR/$ADV_SESSION.log" ;;
+      watchdog)  tail -f "$LOG_DIR/watchdog.log" ;;
+      *) die "usage: $0 logs builder|adversary|watchdog" ;;
+    esac
+    ;;
+  stop)
+    stop_loops
+    if session_alive "$WATCHDOG_SESSION"; then log "killing $WATCHDOG_SESSION"; tmux kill-session -t "$WATCHDOG_SESSION" || true; fi
+    log "stopped."
+    ;;
+  *)
+    cat <<EOF
+cc-ci loop launcher
+
+  $0 start                          start both loops + watchdog (idempotent)
+  $0 status                         show session + DONE state
+  $0 logs builder|adversary|watchdog   tail a log
+  $0 stop                           stop everything
+  $0 watchdog                       run supervision loop in foreground
+
+Key env vars (current value):
+  CC_CI_REPO      = ${CC_CI_REPO:-<unset — set to enable DONE detection>}
+  CLAUDE_BIN      = $CLAUDE_BIN
+  CLAUDE_FLAGS    = $CLAUDE_FLAGS
+  REMOTE_CONTROL  = $REMOTE_CONTROL  (1 = interactive --remote-control, viewable at claude.ai/code)
+  BUILDER_DIR     = $BUILDER_DIR
+  ADV_DIR         = $ADV_DIR
+  WATCH_INTERVAL  = ${WATCH_INTERVAL}s
+EOF
+    ;;
+esac
diff --git a/cc-ci-plan/plan.md b/cc-ci-plan/plan.md
new file mode 100644
index 0000000..fade52c
--- /dev/null
+++ b/cc-ci-plan/plan.md
@@ -0,0 +1,635 @@
+# cc-ci — Co-op Cloud Recipe CI Server (Autonomous Build Plan)
+
+**Status:** ACTIVE — autonomous loop
+**Owner agent:** Builder (primary) + Adversary (reviewer)
+**Source brief:** `brief.md` (do not edit; this file supersedes it)
+**This file's canonical path:** `/srv/cc-ci/cc-ci-plan/plan.md`
+**Target server:** `cc-ci` (NixOS)
+**Code/config home:** `git.autonomic.zone/recipe-maintainers/cc-ci` (the CI project repo — distinct from this
+`/srv/cc-ci/cc-ci-plan/` planning+launch folder)
+**Last updated:** keep current via `STATUS.md` (see §7)
+
+---
+
+## 0. How to read this document
+
+This plan is written to be handed to an **autonomous Claude agent running in a sandbox over
+several days**, driving itself in a loop until the CI server is "done" per §2. A second agent
+(the **Adversary**) independently tries to disprove every "done" claim. Neither agent is
+trusted to mark its own work complete.
+
+If you are an agent waking up into this loop for the first time, go straight to **§1 Bootstrap**.
+On every subsequent wake, go to **§7 The Loop Protocol** and continue from `STATUS.md`.
+
+The rest of the document (§3–§6) is the technical design. Treat it as the default architecture,
+but you are allowed to revise it when reality disagrees — record any deviation in `DECISIONS.md`
+with a one-line rationale.
+
+---
+
+## 1. Bootstrap (first wake only)
+
+Do these in order. Each step is idempotent; re-running is safe.
+
+1. **Verify access.** (Full credential map + how each is used is in **§1.5** — read it first.)
+   - `ssh cc-ci 'hostname && whoami'` — you log in as **root** on cc-ci (NixOS), so there is no
+     separate sudo step. `ssh cc-ci` is preconfigured to tunnel through the userspace-tailscaled
+     SOCKS proxy (§1.5); if it fails, the proxy/daemon is probably down — restart it (§1.5) before
+     declaring blocked.
+   - `ssh cc-ci 'nixos-version'` — confirm NixOS.
+   - Confirm you can reach the Gitea API with the bot creds from `.testenv` (§1.5):
+     `curl -s https://$GITEA_URL/api/v1/version`. The bot authenticates with
+     `GITEA_USERNAME`/`GITEA_PASSWORD` (basic auth) or a token you mint from them via
+     `POST /api/v1/users/<user>/tokens` — do **not** expect a ready-made `$GITEA_TOKEN`.
+   - Confirm the **preconfigured** test-app DNS (§4.0/§4.4): a random subdomain under the wildcard
+     must resolve, e.g. `getent hosts probe-$RANDOM.ci.commoninternet.net` returns the **gateway's**
+     IP. Do not expect cc-ci's address: the gateway TLS-passthroughs to cc-ci. Use `getent`, not
+     `dig`, because this host's resolver is Tailscale-only (§1.5).
+     Traefik is *not* up yet; you configure it (file provider → the pre-issued cert at
+     `/var/lib/ci-certs/live/`, **no ACME**). The DNS record, gateway passthrough, and cert are the
+     preconditions; full end-to-end HTTPS reachability is proven at M1, not now.
+     If the wildcard does not resolve at all, that's a `## Blocked` item (operator fixes DNS/gateway).
+   - If any check fails, write the failure to `STATUS.md` under `## Blocked` and stop — a human must fix access. Do **not** try to work around missing access.
+
+2. **Create the `cc-ci` repo** on git.autonomic.zone if it does not exist. Push an initial
+   skeleton (see §3 layout). The Builder clones to `/srv/cc-ci/cc-ci`; the Adversary loop keeps
+   its **own independent clone** at `/srv/cc-ci/cc-ci-adv`. The repo is the only channel between
+   the two loops (§6.1) — loop state lives inside it (`STATUS.md`, `BACKLOG.md`, etc.).
+
+3. **Snapshot the starting environment** into `cc-ci/docs/baseline.md`: current NixOS config on
+   the server (`/etc/nixos` or existing flake), installed packages, whether Docker/Swarm/abra
+   already exist, DNS that already points at the box. This is the rollback reference.
+
+4. **Seed the loop state files** (§7) if absent: `STATUS.md`, `BACKLOG.md`, `REVIEW.md`,
+   `JOURNAL.md`, `DECISIONS.md`. Give `BACKLOG.md` two H2 sections — `## Build backlog`
+   (populated from §5 milestones) and `## Adversary findings` (empty) — per the single-writer
+   rule in §6.1.
+
+5. Commit ("chore: bootstrap cc-ci loop state") and begin the loop at §7.
+
+---
+
+## 1.5 Credentials & access — where everything lives and how to use it
+
+The loops run **on the sandbox host** (not on cc-ci) and reach cc-ci over Tailscale. This section
+is the authoritative map of what credentials exist, where, and how to use them. **Never copy any
+secret value into the repo, a commit, a log, or the dashboard** (§9) — reference locations only.
+
+### Provided credentials (already in place)
+
+| What | Where | How to use |
+|---|---|---|
+| **Tailscale auth key** (joins cc-ci's tailnet `taila4a0bf.ts.net`) | `/srv/cc-ci/.testenv` → `TS_AUTH_KEY` (Tailscale SaaS key, keyID ends `CNTRL`) | Used to bring up the userspace tailscaled (below). It's reusable; re-run `tailscale up` with it if the node drops. |
+| **cc-ci SSH (root)** | private key `~/.ssh/cc-ci-root-ed25519`; config `Host cc-ci` in `~/.ssh/config` | Just run `ssh cc-ci` (logs in as **root**). The pubkey is already in cc-ci's `/root/.ssh/authorized_keys`. |
+| **Gitea bot account** | `/srv/cc-ci/.testenv` → `GITEA_USERNAME` (`autonomic-bot`), `GITEA_PASSWORD`, `GITEA_URL` (`git.autonomic.zone`) | Basic-auth to the Gitea API, or mint a scoped token: `POST https://$GITEA_URL/api/v1/users/$GITEA_USERNAME/tokens`. Used to create/push the `cc-ci` repo, read recipe repos, comment on PRs, and register `!testme` webhooks. |
+
+Load them in a shell with: `set -a; . /srv/cc-ci/.testenv; set +a` (don't echo the values).
+
+### The Tailscale connection (how `ssh cc-ci` and the proxy work)
+
+cc-ci (`cc-nix-test`, **100.90.116.4**) is on a *different* tailnet than the sandbox host's default
+one, so it is reached via a **second, userspace tailscaled** — this keeps the host's own tailnet
+untouched. State lives in `~/.cc-ci-ts/`; it exposes a **SOCKS5/HTTP proxy on `127.0.0.1:1055`**,
+which is the only route to that tailnet (userspace networking ⇒ the host OS can't route the tailnet
+IPs directly).
+
+It runs as a **persistent systemd service** (`cc-ci-tailscaled.service`, enabled, `Restart=always`,
+starts on boot; unit at `/etc/systemd/system/cc-ci-tailscaled.service`, runs as user `notplants`).
+It reuses the already-authenticated state in `~/.cc-ci-ts/`, so it reconnects across reboots/crashes
+without the auth key.
+
+- `ssh cc-ci` works out of the box (its `ProxyCommand` uses the proxy; logs in as root).
+- For HTTP(S) to cc-ci / `*.ci.commoninternet.net` from the sandbox, go through the proxy, e.g.
+  `curl --proxy socks5h://localhost:1055 https://<app>.ci.commoninternet.net`.
+- **If connectivity is down:** `sudo systemctl restart cc-ci-tailscaled` (diagnose with
+  `systemctl status cc-ci-tailscaled` / `journalctl -u cc-ci-tailscaled`). A dead proxy is an access
+  failure to recover, not a `## Blocked`-and-stop condition — *unless* the auth key itself is
+  rejected (then re-auth with `tailscale --socket=$HOME/.cc-ci-ts/tailscaled.sock up
+  --auth-key="$TS_AUTH_KEY" --hostname=cc-ci-claude-sandbox --accept-routes --accept-dns=false`, and
+  if that fails the key is a class-A1 blocker).
+- **DNS gotcha:** this host's `/etc/resolv.conf` lists only Tailscale resolvers, so direct
+  `dig @1.1.1.1 …` queries get no answer and look falsely empty. Use `getent hosts <name>` to
+  resolve from the sandbox. `commoninternet.net` itself is a normal public zone hosted at **Gandi**.
+
+### Credentials the loop GENERATES itself (do not wait on a human for these)
+
+- **Drone RPC secret** and **webhook HMAC secret** — generate (`openssl rand -hex 32`), store
+  sops-encrypted in `secrets/`, and wire both ends. Internal shared secrets, not human inputs.
+- **Gitea OAuth app for Drone** — create it under the bot account via the API
+  (`POST /api/v1/user/applications/oauth2`); capture client id/secret into `secrets/`.
+- **cc-ci host age/GPG key for sops** — generate on the host (or derive from its SSH host key);
+  add as a sops recipient. Keep a recovery copy of the master age identity off-box if desired.
+- **Per-recipe app secrets** (class-B, §4.4) — the harness generates these per run.
+
+### Credentials STILL NEEDED from the operator (class-A — block if missing, per §9)
+
+- **Wildcard TLS cert — PROVIDED, not a token.** The operator has pre-issued the wildcard SAN cert
+  (`*.ci.commoninternet.net` + `ci.commoninternet.net`) and placed it on cc-ci at
+  `/var/lib/ci-certs/live/{fullchain.pem,privkey.pem}` (§4.0). The agent points Traefik's file
+  provider at those paths and runs **no ACME** for this domain. **Do not request or expect a
+  `commoninternet.net` DNS token** — issuance/renewal is handled out-of-band by the operator (LE
+  90-day cert; next renewal ~2026-08-24). A missing/expired cert is a finding for the operator, not
+  an agent re-issue.
+- **Registry pull credentials** (e.g. Docker Hub) — *recommended* to avoid anonymous pull-rate
+  limits breaking deploys under load. Treat a rate-limit failure traced to this as a finding, then
+  request creds. Store sops-encrypted in `secrets/`.
+- **Gitea bot permissions** (a grant, not a secret) — confirm `autonomic-bot` can: create/push
+  `recipe-maintainers/cc-ci`, read the recipe repos to be enrolled, comment on their PRs, and add
+  webhooks to them. If any is missing, that's a `## Blocked` item for the operator to fix.
+
+---
+
+## 2. Definition of Done (the loop's exit condition)
+
+The loop terminates **only** when every item below is true *and the Adversary has independently
+re-verified each one within the last 24h* (logged in `REVIEW.md` with timestamps and command
+output). Partial credit does not count.
+
+- [ ] **D1 — Trigger.** Commenting `!testme` on any open PR in any enrolled recipe repo on
+      git.autonomic.zone starts a CI run for the code *at that PR's head commit* within 60s.
+      Other comments do not. Re-commenting re-runs.
+- [ ] **D2 — Test matrix.** For a recipe under test, the run executes, as separate reported
+      stages: **new install**, **upgrade** (previous published version → PR version), and
+      **backup + restore**. All are genuine end-to-end against a really-deployed recipe (real
+      containers, real Traefik routing, real volumes) — no mocks, no stubs.
+- [ ] **D3 — Python + Playwright.** Tests are Python. Functional assertions that require a
+      browser use Playwright against the live deployed app.
+- [ ] **D4 — Recipe-local tests.** If the recipe repo contains its own `tests/` folder, those
+      tests are also discovered and run as part of the same CI run, with results merged in.
+- [ ] **D5 — Per-recipe test tree.** The cc-ci repo holds `tests/<recipe>/` with the
+      install/upgrade/backup tests as Python files, plus a shared harness. Adding a new recipe is
+      a documented, small, repeatable operation.
+- [ ] **D6 — Secrets.** App + infra secrets are handled reproducibly (committed encrypted,
+      decrypted on the server), documented, and rotatable. No plaintext secrets in git, logs, or
+      the results UI.
+- [ ] **D7 — Results UX.** Each run has a stable URL with live, tail-able logs per stage and a
+      final pass/fail; there is an overview page listing recipes with their latest status —
+      look-and-feel comparable to the YunoHost app CI (`ci-apps.yunohost.org`). A PR comment links
+      back to its run and reflects the outcome.
+- [ ] **D8 — Reproducible server.** The entire server (Drone, runner, comment bridge, swarm,
+      Traefik, dashboard, secrets wiring) is declared in the `cc-ci` repo's NixOS flake and can be
+      rebuilt from scratch onto a blank NixOS host following `docs/install.md`, verified by the
+      Adversary doing exactly that on a throwaway VM (or documenting why a full from-scratch
+      rebuild was infeasible and what was tested instead).
+- [ ] **D9 — Documentation.** `README.md` + `docs/` explain architecture, how to enroll a recipe,
+      how to add/run tests locally, how to operate/rotate secrets, and how to debug a failed run.
+      A new engineer can enroll a recipe and get a green run using only the docs.
+- [ ] **D10 — Proof (breadth).** At least **six real recipes** spanning the meaningful
+      categories have a full green run triggered by `!testme` on a real PR, with all three stages
+      (install / upgrade / backup+restore) actually exercised. The set must cover:
+      a stateless/simple app, a single-DB app, a multi-service app, an SSO/identity app, and an
+      object-storage/large-volume app. **Target set (all previously verified deployable):**
+      `hedgedoc` (simple), `cryptpad` (stateful, no external DB), `keycloak` + `authentik`
+      (SSO/identity, DB-backed), `lasuite-docs` and/or `lasuite-drive` (multi-service + S3/MinIO),
+      `matrix-synapse` (DB + media store), `immich` (large volumes + Postgres), `bluesky-pds`
+      (TLS-passthrough/atproto). Pick six that together satisfy the categories; record the chosen
+      set and per-recipe green-run evidence in `REVIEW.md`. Any recipe that genuinely cannot be CI'd
+      is a documented finding (in `DECISIONS.md`) with the reason, not a silent omission.
+
+When all of D1–D10 hold and are Adversary-verified, write `## DONE` to `STATUS.md` with the
+evidence links and stop scheduling new iterations.
+
+---
+
+## 3. Repository layout (`git.autonomic.zone/recipe-maintainers/cc-ci`)
+
+```
+cc-ci/
+├── README.md
+├── flake.nix                 # NixOS host(s) + devshell
+├── flake.lock
+├── hosts/
+│   └── cc-ci/
+│       ├── configuration.nix # the cc-ci machine
+│       └── hardware.nix
+├── modules/
+│   ├── drone.nix             # Drone server + runner (exec/docker)
+│   ├── comment-bridge.nix    # !testme webhook listener service
+│   ├── swarm.nix             # Docker + single-node swarm + Traefik for test apps
+│   ├── dashboard.nix         # results overview site
+│   └── secrets.nix           # sops-nix / agenix wiring
+├── secrets/                  # sops-encrypted (*.enc / *.age); see §4.4
+│   └── secrets.yaml
+├── bridge/                   # comment-bridge source (small Go/Python service)
+├── runner/                   # CI orchestration entrypoint invoked by Drone
+│   ├── run_recipe_ci.py      # top-level: deploy→test→teardown for a recipe@ref
+│   └── harness/              # shared pytest fixtures (abra wrappers, app lifecycle)
+├── dashboard/                # results UI generator (reads Drone API → static site)
+├── tests/
+│   ├── conftest.py           # shared fixtures, recipe selection, teardown guarantees
+│   ├── <recipe>/
+│   │   ├── test_install.py
+│   │   ├── test_upgrade.py
+│   │   ├── test_backup.py
+│   │   └── playwright/       # e2e flows for this recipe
+│   └── _template/            # copy-to-add-a-recipe template
+├── docs/
+│   ├── install.md            # from-scratch server build (D8)
+│   ├── enroll-recipe.md      # how to add a recipe (D5)
+│   ├── secrets.md            # secret model + rotation (D6)
+│   ├── architecture.md
+│   ├── runbook.md            # debugging failed runs
+│   └── baseline.md           # bootstrap snapshot
+├── STATUS.md  BACKLOG.md  REVIEW.md  JOURNAL.md  DECISIONS.md   # loop state (§7)
+└── .drone.yml                # pipeline for cc-ci's own repo (lint/self-test)
+```
+
+---
+
+## 4. Technical design (default architecture)
+
+### 4.0 Domain model (where things live)
+
+Two DNS zones, deliberately separated — do **not** conflate them:
+
+- **`git.autonomic.zone` — source of truth for code (unchanged, not ours to reconfigure).**
+  The Gitea host: the enrolled recipe repos and the `cc-ci` config repo live here. The loop reads,
+  comments, and (when enrolling) adds a webhook here, but deploys **nothing** here. Per §9 this zone
+  is read/comment-only — never push recipe code, never point app DNS at it.
+- **`commoninternet.net` — the CI server's own zone; everything CI-facing.** A wildcard
+  `*.ci.commoninternet.net` resolves to a **gateway** (not cc-ci directly — see Network path below).
+  Under it:
+  - **Apps under test:** each run deploys to a unique subdomain
+    `<recipe>-pr<n>-<short-sha>.ci.commoninternet.net`, so concurrent runs never collide on a
+    hostname. The subdomain (app, volumes, secrets, Traefik route) is torn down at run end (§4.3).
+  - **Results dashboard:** `ci.commoninternet.net` — overview page + per-recipe status badges (§4.5).
+  - **Webhook bridge:** `ci.commoninternet.net/hook` — the Gitea `issue_comment` receiver (§4.1).
+- **Network path (gateway → TLS passthrough → cc-ci).** The wildcard record does **not** point at
+  cc-ci's IP. It points at a gateway that **passes TLS through** to cc-ci: the gateway routes by SNI
+  and forwards the raw encrypted stream without decrypting it, so TLS still **terminates on cc-ci's
+  Traefik**. Consequences the agent must respect:
+  - `dig <sub>.ci.commoninternet.net` returns the **gateway's** IP, not cc-ci's — do not assert the
+    record points at cc-ci. Reachability is proven end-to-end (an HTTPS request lands on cc-ci),
+    not by comparing A records.
+  - The gateway is assumed to pass through the **whole wildcard**, so a fresh per-run subdomain needs
+    **no gateway change** and **no cert work** (the pre-issued wildcard already covers it) — the
+    agent only adds the Traefik **router** on cc-ci. (If the gateway
+    instead needs per-host config, that's an operator/gateway concern and a `## Blocked` item, not
+    something the agent reconfigures — the gateway is not ours, only cc-ci is, per §9.)
+  - The gateway is operator-managed and out of scope; the agent configures only cc-ci.
+  - **Caveat for TLS-passthrough recipes** (e.g. `bluesky-pds`, §2 D10): the default path terminates
+    TLS at cc-ci's Traefik. A recipe that expects to terminate TLS in its own container needs cc-ci's
+    Traefik configured to pass through that host too (the outer gateway already passes the whole
+    wildcard). Treat this as a per-recipe harness quirk to absorb (§5 M6.5), or pick a non-passthrough
+    recipe for that D10 category and record the swap in `DECISIONS.md` — not a silent omission.
+- **Wildcard TLS — operator pre-issues, agent serves it statically (no token in the agent).**
+  Routing and certs are separate: the preconfigured wildcard DNS solves routing only; a cert is
+  still needed because the gateway passes TLS through and cc-ci's Traefik terminates it. **The cert
+  is pre-provisioned out-of-band** so the DNS-editing token never enters the agent/repo. A wildcard
+  SAN cert covering **`*.ci.commoninternet.net` + `ci.commoninternet.net`** (issued via Let's
+  Encrypt DNS-01 against Gandi, by the operator, using a token the agent never sees) lives on cc-ci:
+  - `/var/lib/ci-certs/live/fullchain.pem` (leaf+intermediate) and `…/privkey.pem`.
+  - The agent configures **Traefik's file provider** (`tls.certificates`, `certFile`/`keyFile`
+    pointing at those paths) to serve it, and runs **no ACME resolver** for this domain. One cert
+    covers every per-run subdomain, so spinning up an app domain needs no cert work at all.
+  - **Renewal is a manual operator task** (LE 90-day cert): the operator re-issues out-of-band and
+    drops the new files at the same paths (Traefik file provider hot-reloads). The agent must **not**
+    attempt ACME/DNS-01 for `commoninternet.net` and must **not** expect a DNS token — a missing/
+    expired cert is an operator action, surfaced as a finding, not something the agent re-issues.
+  (Rationale for choosing a wildcard cert over per-subdomain: a wildcard is reused for every churning
+  run subdomain and sidesteps LE's 50-certs/week-per-domain limit; only DNS-01 can mint a wildcard.
+  We keep that DNS-01 issuance with the operator rather than handing the agent the zone token.)
+- Record the live facts in `docs/install.md`: the zone + DNS provider (Gandi), that the wildcard
+  `*.ci.commoninternet.net` (and bare `ci.commoninternet.net`) point at the **gateway**, that the
+  gateway TLS-passthroughs the wildcard to cc-ci, the gateway's address, the TTL, and that the
+  wildcard cert is pre-issued/operator-renewed at `/var/lib/ci-certs/live/` (no DNS token on cc-ci).
+
+### 4.1 The `!testme` trigger path
+
+Gitea does not natively forward PR-comment events to Drone, and Drone's built-in triggers fire on
+push/PR-open, not on a magic comment. So:
+
+```
+PR comment "!testme"
+   │  Gitea webhook (issue_comment event)  ──►  comment-bridge (modules/comment-bridge.nix)
+   │                                              • verifies webhook HMAC secret
+   │                                              • checks comment body == "!testme" (exact, trimmed)
+   │                                              • checks commenter is allowed (org member / collaborator)
+   │                                              • resolves PR head repo + SHA via Gitea API
+   │                                              • calls Drone API: build for cc-ci pipeline,
+   │                                                params RECIPE=<repo> REF=<sha> PR=<n> SRC=<headrepo>
+   ▼
+Drone build (cc-ci repo pipeline, parameterized) ──► runner/run_recipe_ci.py
+   ▼
+Bridge posts/updates a Gitea PR comment with the run URL and (on completion) pass/fail.
+```
+
+- The bridge is a tiny service (Go or Python+FastAPI). Keep it dependency-light; it's a NixOS
+  systemd service behind Traefik at e.g. `ci.commoninternet.net/hook` (§4.0).
+- Enrollment = registering the Gitea webhook on a recipe repo (script in `runner/` or documented
+  in `enroll-recipe.md`) + ensuring a `tests/<recipe>/` dir exists.
+- Decide and record in DECISIONS.md: one shared Gitea org-level webhook vs per-repo webhooks.
+  Org-level is fewer moving parts; per-repo is more explicit. Default: per-repo via enroll script.
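+
+The first two bridge checks in the diagram above can be sketched as follows. This is an
+illustration, not the bridge itself: it assumes the signature arrives as a hex-encoded HMAC-SHA256
+of the raw request body (confirm the header name and encoding against the installed Gitea
+version), and the function names are hypothetical.
+
+```python
+import hashlib
+import hmac
+
+TRIGGER = "!testme"
+
+
+def signature_ok(secret: bytes, body: bytes, signature_hex: str) -> bool:
+    """Constant-time comparison of the expected and received HMAC-SHA256 hex digests."""
+    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
+    received = signature_hex.strip().lower()
+    return hmac.compare_digest(expected.encode(), received.encode())
+
+
+def is_trigger(comment_body: str) -> bool:
+    """True only for an exact match after trimming surrounding whitespace."""
+    return comment_body.strip() == TRIGGER
+```
+
+Comparing with `hmac.compare_digest` rather than `==` avoids leaking how many leading characters
+of a forged signature were correct.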
+
+### 4.2 Drone + the test target
+
+- Drone server connects to Gitea via OAuth app (Gitea → Settings → Applications). Runner is the
+  **exec runner** (or a privileged docker runner) running **on cc-ci itself**, because tests must
+  drive `abra` to deploy real recipes onto a real swarm.
+- cc-ci doubles as the **deploy target**: single-node Docker Swarm + Traefik, abra installed,
+  serving the `*.ci.commoninternet.net` wildcard, TLS terminated on cc-ci's Traefik using the
+  **pre-issued static wildcard cert** at `/var/lib/ci-certs/live/` (§4.0). The operator preconfigures
+  the wildcard DNS record (→ gateway), the gateway's TLS-passthrough to cc-ci, and the cert itself
+  (§4.4); the agent configures Traefik (file provider → that cert) and swarm on top — **no ACME**.
+- Each CI run gets an isolated app domain `<recipe>-pr<n>-<short-sha>.ci.commoninternet.net`
+  (§4.0) so concurrent runs don't collide. Teardown removes app, secrets, and volumes.
+- Consider a concurrency cap (1–2 deploys at a time) to avoid resource thrash; document it.
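+
+A small sketch of how the harness might derive the per-run app domain. The 8-character short SHA
+and the helper name are assumptions (the plan does not fix a length); the 63-character check
+reflects the DNS limit on a single label.
+
+```python
+import re
+
+CI_ZONE = "ci.commoninternet.net"
+SHORT_SHA_LEN = 8  # assumption: the plan does not specify the short-sha length
+
+
+def run_domain(recipe: str, pr: int, sha: str) -> str:
+    """Build <recipe>-pr<n>-<short-sha>.ci.commoninternet.net and validate the label."""
+    label = f"{recipe}-pr{pr}-{sha[:SHORT_SHA_LEN]}".lower()
+    # A DNS label is 1-63 chars of [a-z0-9-] and may not start or end with a hyphen.
+    if not re.fullmatch(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?", label):
+        raise ValueError(f"not a valid DNS label: {label!r}")
+    return f"{label}.{CI_ZONE}"
+```
+
+Because the label embeds both the PR number and the head commit, two concurrent runs only share a
+hostname if they test the same commit of the same PR.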
+
+### 4.3 The test harness & recipe test contract
+
+`runner/run_recipe_ci.py` orchestrates per run:
+1. Fetch recipe at `$REF` (the PR head) via abra/git.
+2. **Install stage** → `tests/<recipe>/test_install.py`: `abra app new`, generate secrets,
+   `abra app deploy`, wait healthy, run Playwright smoke + assertions.
+3. **Upgrade stage** → deploy previous published version first, then upgrade to `$REF`; assert
+   data survives and app still healthy.
+4. **Backup/restore stage** → `abra app backup`, mutate state, `abra app restore`, assert restored
+   state matches pre-mutation.
+5. **Recipe-local tests (D4)** → if `<recipe-repo>/tests/` exists, discover & run it in the same
+   live environment; merge results.
+6. **Teardown (always, even on failure)** → `abra app undeploy`, `abra app volume remove`,
+   `abra app secret remove`, DNS/route cleanup.
+
+Shared fixtures (`tests/conftest.py` + `runner/harness/`) wrap abra. **Known abra gotchas to bake
+in from day one** (carried over from prior work, re-verify on the installed abra version):
+- `abra app undeploy` and `abra app volume remove` do **not** accept `--chaos` → never pass it.
+- Plumb a `timeout` kwarg through secret-generate/insert/remove-all calls.
+- `abra app ls -S -m` returns nested `{server: {apps: [...]}}` — parse the inner structure.
+- Pick robust health checks per app (e.g. Keycloak: `/realms/master`, not `/`).
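+
+To illustrate the nested-output gotcha, here is a sketch that flattens a
+`{server: {apps: [...]}}` payload into one list. Only the outer shape comes from the note above;
+the per-app field name in the usage example (`appName`) is a placeholder, so re-check it against
+the installed abra version.
+
+```python
+import json
+
+
+def flatten_apps(raw: str) -> list[dict]:
+    """Flatten {server: {"apps": [...]}} JSON into a list of app dicts tagged with their server."""
+    payload = json.loads(raw)
+    apps = []
+    for server, entry in payload.items():
+        # Tolerate a missing or null "apps" key for servers with nothing deployed.
+        for app in entry.get("apps") or []:
+            apps.append({"server": server, **app})
+    return apps
+```
+
+Usage, with an illustrative payload: `flatten_apps('{"cc-ci": {"apps": [{"appName": "x"}]}}')`
+yields a single dict carrying both `server` and `appName`.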
+
+The teardown guarantee is sacred: a failed test must never leak a deployed app or volume into the
+next run. Implement teardown as a pytest fixture finalizer / `try/finally` in the orchestrator and
+add a janitor pass at run start that nukes any orphaned `*-pr*` apps older than N hours.
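+
+The guarantee can be sketched as a `try/finally` around the stages plus a janitor filter. The
+stage callables, the stop-at-first-failure behaviour, and the `{app_name: created_at}` input shape
+are assumptions made for this illustration; the real harness would wrap abra calls.
+
+```python
+from datetime import datetime, timedelta
+
+
+def run_stages(stages, teardown):
+    """Run (name, callable) stages in order and always call teardown, even on failure."""
+    results = {}
+    try:
+        for name, stage in stages:
+            try:
+                stage()
+                results[name] = "pass"
+            except Exception as exc:
+                results[name] = f"fail: {exc}"
+                break  # assumption: later stages are skipped once one fails
+    finally:
+        teardown()
+    return results
+
+
+def orphans(apps: dict, now: datetime, max_age_hours: int) -> list:
+    """Names of per-run apps (containing '-pr') created before the age cutoff."""
+    cutoff = now - timedelta(hours=max_age_hours)
+    return sorted(name for name, created in apps.items() if "-pr" in name and created < cutoff)
+```
+
+The `finally` block covers failures inside a run; the janitor covers the case the `finally` cannot,
+such as the runner process being killed mid-run.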
+
+### 4.4 Secrets (D6)
+
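+There are **two distinct classes of secret**, (A) and (B), and they are handled in opposite ways.
+Do not conflate them. (C) below is not a third class; it only covers how Drone stores values from
+class (A).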
+
+**(A) Infra secrets.** All of these end up `sops-nix`-encrypted in `secrets/`, are decrypted at
+activation to a root-owned runtime path (sops-nix's default is `/run/secrets`, not the
+world-readable Nix store), and are never world-readable. They split into two sub-classes (see §1.5
+for the concrete locations/usage), and only the first sub-class blocks:
+
+- **(A1) External inputs — provided by the operator, the loop cannot create them.** The Tailscale
+  auth key + Gitea bot creds (`/srv/cc-ci/.testenv`, already provided), the **pre-issued wildcard
+  TLS cert** at `/var/lib/ci-certs/live/` (§4.0 — *not* a DNS token; the agent serves it, never
+  issues it), and **registry pull creds** (if needed). If one of these is **missing or invalid, the
+  loop is blocked** — write it to `STATUS.md ## Blocked` and stop (§9). The agent must not invent or
+  work around an external input it wasn't given, and must **not** attempt ACME/DNS-01 for
+  `commoninternet.net`.
+- **(A2) Internal secrets — the loop generates and manages these itself; never block on them.**
+  Drone RPC secret + webhook HMAC (`openssl rand`), the Gitea OAuth app for Drone (created via the
+  bot API), and the cc-ci host age/GPG key for sops. These are *not* human inputs; generate, store
+  in `secrets/`, and wire both ends.
+
+Alongside these, three **preconfigured network/cert facts** are operator-provided inputs the loop
+also depends on (not secrets the agent makes, but class-A in the same "provided, don't improvise"
+sense): (1) the wildcard `*.ci.commoninternet.net` record (and bare `ci.commoninternet.net`) already
+points at the **gateway**, (2) the gateway **TLS-passthroughs** that wildcard to cc-ci (SNI-routed,
+no decryption — see §4.0 Network path), and (3) the **pre-issued wildcard cert** is in place at
+`/var/lib/ci-certs/live/`. The operator owns the DNS record, the gateway, and cert issuance/renewal;
+**everything else on cc-ci is the agent's job** — Traefik (pointed at the static cert), swarm,
+per-run subdomain routing, and teardown. If the wildcard does not resolve, the gateway doesn't reach
+cc-ci, or the cert is missing/expired, that is a `## Blocked` condition (operator action), not
+something to work around (the gateway and DNS are not ours to reconfigure, per §9).
+
+**(B) Recipe app secrets — generated by the test, persisted within the run.** These are NOT a
+blocker and are NOT pre-provisioned by a human. The harness creates them itself for each app under
+test and is responsible for persisting them across the run so the multi-stage lifecycle works:
+
+- **Generate at install:** the harness runs `abra app secret generate` (+ inserts any deterministic
+  test fixtures like an admin password / test user it chooses) when it deploys the app.
+- **Persist for the run's duration:** the *same* generated secrets must survive across stages —
+  install → upgrade and especially **backup → restore** — because an app cannot be upgraded or
+  restored against rotated credentials. Persist them in a per-run secret store keyed by the run's
+  unique app name (e.g. `<recipe>-pr<n>-<sha>`): the live abra/swarm secrets plus a sidecar record
+  the harness writes (e.g. the app's `.env` + the generated values) to a run-scoped, non-public
+  location on the runner, so any stage can re-read them. They are ephemeral by design.
+- **Destroy at teardown:** the same teardown that removes the app/volumes also runs
+  `abra app secret remove` (with `timeout` plumbed) and deletes the per-run sidecar. Nothing
+  generated for a run outlives that run.
+- **How the harness should "figure out" persistence (acceptance for D6):** decide and document one
+  concrete mechanism — recommended default is "abra/swarm holds the live secrets; the harness keeps
+  a run-scoped sidecar file under a `runs/<app-name>/` dir on the runner (mode 600), and reloads
+  from it between stages." Whatever is chosen, it must (1) keep the same values stable across all
+  three stages, (2) isolate concurrent runs from each other, and (3) leave nothing behind.
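+
+A sketch of the recommended sidecar mechanism, under stated assumptions: the file name
+`secrets.json`, the JSON encoding, and the helper names are illustrative choices, not part of the
+plan. It shows the three properties above: stable values between stages, one directory per run,
+and full removal at teardown.
+
+```python
+import json
+import os
+import shutil
+from pathlib import Path
+
+
+def save_sidecar(runs_root: Path, app_name: str, values: dict) -> Path:
+    """Write the run's generated values to runs/<app-name>/secrets.json with mode 600."""
+    run_dir = runs_root / app_name
+    run_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
+    path = run_dir / "secrets.json"
+    # Create with restrictive permissions from the start, not chmod after writing.
+    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
+    with os.fdopen(fd, "w") as fh:
+        json.dump(values, fh)
+    os.chmod(path, 0o600)  # also tightens a pre-existing file
+    return path
+
+
+def load_sidecar(runs_root: Path, app_name: str) -> dict:
+    """Reload the same values in a later stage (upgrade, backup, restore)."""
+    return json.loads((runs_root / app_name / "secrets.json").read_text())
+
+
+def destroy_sidecar(runs_root: Path, app_name: str) -> None:
+    """Teardown: remove the whole run directory so nothing outlives the run."""
+    shutil.rmtree(runs_root / app_name, ignore_errors=True)
+```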
+
+**(C) Drone CI tokens:** store as Drone org/repo secrets, referenced by the pipeline. Where a value
+is an external input (A1, e.g. registry creds) it is provided; where it is internal (A2) it is
+generated — see the (A) split above.
+
+Hard rule across all classes: scrub secrets from logs before they reach the dashboard; the results
+UI shows sanitized logs only. Add a redaction filter in the log pipeline and an Adversary test that
+greps published logs and the overview site for known secret patterns and any generated app
+password.
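+
+One possible shape for the redaction filter, as a sketch: it replaces known secret values
+verbatim and assumes the harness can hand it every value generated for the run. The names and the
+`[REDACTED]` placeholder are hypothetical. Pattern-based detection of unknown secrets would be a
+separate, additional layer.
+
+```python
+import re
+
+
+def make_redactor(secret_values, placeholder="[REDACTED]"):
+    """Return a function that masks every known secret value in a log line."""
+    # Longest first, so a secret that contains another secret is masked whole.
+    values = sorted({v for v in secret_values if v}, key=len, reverse=True)
+    if not values:
+        return lambda line: line
+    pattern = re.compile("|".join(re.escape(v) for v in values))
+    return lambda line: pattern.sub(placeholder, line)
+```
+
+The Adversary's leak test can reuse the same value list in reverse: any published line that still
+contains one of the values is a finding.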
+
+### 4.5 Results UX (D7) — YunoHost-CI-like
+
+- **Per-run logs:** Drone's native UI already gives live, per-stage, tail-able logs and a final
+  status — use it as the canonical run view; the PR comment links to it.
+- **Overview page:** a small generator (`dashboard/`) polls the Drone API and renders a static
+  page at `ci.commoninternet.net` (§4.0): a table of enrolled recipes, latest run status badge
+  (pass/fail/running), last-tested version, link to history — mirroring the YunoHost app-list
+  feel. Served by Traefik; regenerated on build-completion webhook or a short timer.
+- Provide a status badge endpoint per recipe for embedding in recipe READMEs.
+
+---
+
+## 5. Milestones / initial BACKLOG
+
+Work top-down; each milestone ends with an **Adversary gate** (Adversary must independently
+verify the acceptance check before the next milestone starts). Seed `BACKLOG.md` from this.
+
+- **M0 — Foundations.** Repo created; flake builds; `nixos-rebuild` (or deploy-rs) applies a
+  no-op-then-base config to cc-ci; sops decrypts a test secret on the host.
+  *Accept:* `ssh cc-ci 'systemctl is-system-running'` healthy after a rebuild from the repo.
+- **M1 — Swarm + abra target.** Docker + single-node swarm + Traefik up; wildcard DNS + TLS;
+  abra can deploy and tear down a trivial recipe by hand.
+  *Accept:* a recipe deployed via abra is reachable over HTTPS at `*.ci.commoninternet.net`, then
+  fully torn down leaving no volumes.
+- **M2 — Drone online.** Drone server+runner via Nix, OAuth to Gitea; a hello-world `.drone.yml`
+  in cc-ci runs green; logs visible in Drone UI.
+  *Accept:* push to cc-ci triggers a visible green Drone build.
+- **M3 — Comment bridge.** `!testme` on a PR triggers a parameterized Drone build; bridge posts a
+  PR comment with the run link; non-`!testme` comments and non-collaborators are ignored.
+  *Accept:* live demo on a scratch PR — comment in, build out, link back, auth enforced.
+- **M4 — Harness + install stage.** `run_recipe_ci.py` + conftest; install stage green for one
+  simple recipe end-to-end with a Playwright assertion; guaranteed teardown.
+  *Accept:* full green install run for recipe #1, no orphaned app/volume afterward.
+- **M5 — Upgrade + backup/restore stages.** Add the other two stages for recipe #1.
+  *Accept:* upgrade preserves data; backup→mutate→restore returns original state.
+- **M6 — Recipe-local tests (D4) + second recipe.** Discover/run recipe-repo `tests/`; enroll a
+  second, DB-backed recipe via the documented flow.
+  *Accept:* both recipes green; recipe-local tests demonstrably executed and merged.
+- **M6.5 — Breadth ramp.** Enroll recipes 3→6 covering the remaining D10 categories, one at a
+  time, each via the documented enroll flow (this is the real test of D5: enrolling recipe N
+  should be template-copy + recipe-specific tests/fixtures, with **no harness surgery**). Expect
+  per-recipe quirks — multi-service deps, S3/MinIO config, SSO client setup, TLS passthrough,
+  large-volume backups — and absorb them into the *shared* harness, not one-off per-recipe hacks.
+  When flakiness appears, add real readiness/wait robustness to the harness rather than sprinkling
+  `sleep`s. Run benchmarks/long deploys **sequentially**, never in parallel (network contention).
+  *Accept:* recipes 3–6 each have a full three-stage green run; enrolling N≥3 needed no changes to
+  shared harness code.
+- **M7 — Secrets hardening (D6).** Full sops model, rotation doc, log redaction + leak test.
+  *Accept:* Adversary's secret-grep over published logs finds nothing; rotation doc followed.
+- **M8 — Dashboard (D7).** Overview page + badges + PR-comment outcome reflection.
+  *Accept:* overview matches reality across several runs; outcomes mirrored to PR comments.
+- **M9 — Reproducibility + docs (D8/D9).** `docs/install.md` rebuilds the server from scratch on a
+  blank VM; all docs complete.
+  *Accept:* Adversary rebuilds from docs onto a throwaway host (or records the tested subset).
+- **M10 — Proof (D10).** All six chosen recipes green via real `!testme` PRs (the breadth set from
+  M4–M6.5 carried through the hardened pipeline), each with install/upgrade/backup-restore
+  exercised and Adversary-verified; flip `STATUS.md` to DONE.
+
+---
+
+## 6. The two agents
+
+### Builder (primary)
+Implements the backlog top-down. Discipline:
+- One backlog item in flight at a time. Small, committed, reversible steps.
+- Every change verified against the *real* system (server, Drone, Gitea) before claiming done —
+  never "should work". Paste the verifying command + output into `JOURNAL.md`.
+- Touch production carefully: cc-ci is the only target; never deploy test apps onto unrelated
+  production servers; never reuse production domains. Idempotent server changes only (via Nix).
+- If blocked on access/secrets/external state, write it to `STATUS.md ## Blocked` and pick up an
+  unblocked item rather than hacking around it.
+
+### Adversary (reviewer)
+Runs as a **separate, independent loop in its own process/sandbox** (see §6.1 for how the two
+loops coordinate). Its job is to **disbelieve**. It:
+- Re-verifies each `Definition of Done` and milestone-acceptance claim independently, from a cold
+  start (fresh shell, own clone, no cached state), and logs PASS/FAIL + evidence in `REVIEW.md`.
+- Actively tries to break things: comment `!testmexyz` (should NOT trigger), comment as a
+  non-collaborator (should be rejected), push a PR that fails tests (must report red, not green),
+  kill an app mid-run (teardown must still clean up), grep published logs/dashboard for secrets,
+  run two `!testme`s concurrently (no domain/volume/secret collision), confirm the same generated
+  app secrets persist across install→upgrade→backup/restore.
+- Files every defect as a `BACKLOG.md` item tagged `[adversary]` with repro steps. The Builder
+  may not close an adversary item; only the Adversary closes it after re-test.
+- Has veto power over `STATUS.md → DONE`.
+
+### 6.1 Coordination protocol (two independent loops, one shared repo)
+
+The two loops never talk directly; the **git repo is the only coordination medium**. Each agent
+has its own clone (e.g. Builder in `/srv/cc-ci/cc-ci`, Adversary in `/srv/cc-ci/cc-ci-adv`) and
+its own pacing. To make concurrent writes conflict-free:
+
+- **File ownership (one writer each — the other only reads):**
+  - Builder owns: all source code/config, `STATUS.md`, `JOURNAL.md`, `DECISIONS.md`.
+  - Adversary owns: `REVIEW.md`.
+  - `BACKLOG.md` is split into two H2 sections — `## Build backlog` (Builder-only) and
+    `## Adversary findings` (Adversary-only). Each agent edits **only its own section**, so git
+    merges the two cleanly. Closing an item = checking the box *in your own section*; the Builder
+    fixes an `[adversary]` finding and notes the fix in JOURNAL, but only the Adversary ticks it
+    closed after re-test.
+- **Append-only where possible.** `JOURNAL.md` and `REVIEW.md` are append-only logs → they never
+  conflict. Prefer appending over rewriting.
+- **Git discipline (both loops, every write):** `git pull --rebase` before editing, make the
+  smallest change, commit, `git push`. With one writer per file/section, a rebase conflict can only
+  mean an ownership rule was broken: re-pull and keep to your own files. Never `--force`.
+- **Gate handshake via STATUS.md.** When the Builder believes a milestone gate is met, it sets in
+  `STATUS.md`: `Gate: <Mn> CLAIMED, awaiting Adversary` and stops advancing past it. The
+  Adversary, on its next wake, sees the claim, runs the acceptance check cold, and writes the
+  verdict to `REVIEW.md` (`<Mn>: PASS @<ts>` with evidence, or `FAIL` + an `[adversary]` item).
+  The Builder only proceeds past the gate after seeing `PASS` in `REVIEW.md`.
+- **DONE handshake.** Builder may write `## DONE` to `STATUS.md` **only** when `REVIEW.md` shows a
+  PASS dated within 24h for every D1–D10. The Adversary can write `## VETO <reason>` to
+  `REVIEW.md` at any time, which forbids DONE until cleared.
+- **Liveness.** If the Adversary sees a gate `CLAIMED` for too long with no Builder progress, or
+  the Builder sees no Adversary verdict on a standing claim, note it in your own ledger and keep
+  doing independent work — neither loop blocks idle waiting on the other beyond its gate.
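+
+The DONE handshake can be checked mechanically. The sketch below assumes D-item verdicts in
+`REVIEW.md` follow the same `<item>: PASS @<ts>` shape as the milestone verdicts, with ISO-8601
+timestamps that carry a UTC offset, and it treats any `## VETO` heading as still standing (how a
+veto is cleared is not specified here). All names are hypothetical.
+
+```python
+import re
+from datetime import datetime, timedelta
+
+VERDICT = re.compile(r"^(D\d+): (PASS|FAIL)\b(?: @(\S+))?", re.MULTILINE)
+REQUIRED = [f"D{i}" for i in range(1, 11)]
+
+
+def done_allowed(review_text: str, now: datetime) -> bool:
+    """True only if every D1..D10 has a latest verdict of PASS dated within 24h and no VETO."""
+    if re.search(r"^## VETO\b", review_text, re.MULTILINE):
+        return False
+    latest = {}
+    for item, verdict, ts in VERDICT.findall(review_text):
+        latest[item] = (verdict, ts)  # append-only log: the last line for an item wins
+    for item in REQUIRED:
+        verdict, ts = latest.get(item, ("FAIL", ""))
+        if verdict != "PASS" or not ts:
+            return False
+        if now - datetime.fromisoformat(ts) > timedelta(hours=24):
+            return False
+    return True
+```
+
+Because `REVIEW.md` is append-only, taking the last verdict per item means a later `FAIL`
+automatically overrides an earlier `PASS`.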
+
+(If you are ever forced to run with a single process, the degraded fallback is to alternate
+roles per iteration and keep `JOURNAL.md` and `REVIEW.md` strictly separate — but two loops is
+the intended design.)
+
+---
+
+## 7. The Loop Protocol
+
+Both loops run this same shape; state lives in the repo so it survives restarts/compaction. On
+every wake, `git pull --rebase` first, then:
+
+1. **Orient.** Read `STATUS.md` (phase, in-flight item, gate claims, blockers), `BACKLOG.md`, and
+   the tail of `REVIEW.md`. Reconcile with reality via cheap probes (Drone health, last build,
+   `git log`) — never trust the ledger blindly; if it disagrees with the system, fix the ledger
+   first (your own files only — see §6.1).
+2. **Select.**
+   - *Builder:* highest-priority open item in `## Build backlog`: unresolved `[adversary]`
+     findings > current milestone's next task > next milestone. Never advance past a milestone gate
+     until `REVIEW.md` shows its PASS.
+   - *Adversary:* any standing `Gate: <Mn> CLAIMED` in `STATUS.md` to verify > re-verify a D1–D10
+     gate whose last PASS is stale (>24h) > a fresh break-it probe from §6.
+3. **Act.** Smallest change that advances the item. Builder verifies against the real system;
+   Adversary verifies from a cold start. Commit with a clear message (author per repo convention).
+4. **Record (your own files only).** *Builder:* append to `JOURNAL.md` (what you did + verifying
+   command/output + next), update `STATUS.md`, tick `## Build backlog`. *Adversary:* append PASS/
+   FAIL + evidence to `REVIEW.md`, add/close items in `## Adversary findings`. Then `git push`.
+5. **Gate handshake (§6.1).** Builder, on reaching a milestone, sets `Gate: <Mn> CLAIMED, awaiting
+   Adversary` in `STATUS.md` and works on other unblocked items meanwhile. Adversary clears it with
+   a `REVIEW.md` verdict. No gate is "passed" without a logged PASS.
+6. **Decide continuation.** Builder writes `## DONE` only when `REVIEW.md` shows a <24h PASS for
+   every D1–D10 and no standing `## VETO`. Otherwise schedule the next wake.
+
+**Pacing.** Use `/loop` (self-paced) or `ScheduleWakeup`. Most waits here are for things the
+harness can't notify you about — a Drone build, a `nixos-rebuild`, a deploy converging — so poll
+the *specific* thing: while a build/deploy is in flight, re-check on a short cadence (≈4 min) to
+stay cache-warm; when genuinely idle between iterations, sleep longer (20–30 min). Don't burn
+iterations spinning on a build that takes minutes.
+
+**Anti-drift guards.**
+- Cap retries: if an approach fails 3× the same way, stop, write the dead-end in `DECISIONS.md`,
+  and try a different approach or mark blocked. No thrashing.
+- Never weaken a test to make it pass. A red test is information; "fix" the recipe/harness or file
+  a finding — do not delete the assertion. (This is the single most important rule; the Adversary
+  watches specifically for tests being softened or skipped.)
+- Keep changes reversible; prefer Nix-declared state over imperative server edits so any rebuild
+  reproduces it.
+- Don't expand scope beyond §2. New ideas → `BACKLOG.md` (tagged `[idea]`), not into this run.
+
+---
+
+## 8. Open decisions to settle early (log in DECISIONS.md)
+
+- Deploy mechanism: `nixos-rebuild --target-host` vs `deploy-rs`/`colmena`. (Default: deploy-rs
+  for atomic rollbacks; nixos-rebuild fine if simpler.)
+- Webhook scope: per-repo vs org-level Gitea webhook. (Default: per-repo via enroll script.)
+- Drone runner type: exec vs privileged docker. (Default: exec, since it must drive host abra.)
+- Secret tool: sops-nix vs agenix. (Default: sops-nix for multi-recipient + yaml ergonomics.)
+- Wildcard TLS: **SETTLED — operator pre-issues a wildcard cert; the agent serves it statically, no
+  token** (§4.0). The operator issued a wildcard SAN cert (`*.ci.commoninternet.net` +
+  `ci.commoninternet.net`) via LE DNS-01/Gandi out-of-band and placed it at
+  `/var/lib/ci-certs/live/`; the agent configures Traefik's file provider to serve it and runs no
+  ACME for this domain. Chosen so the DNS-editing token never enters the repo/agent. **Manual
+  renewal** every ~90 days (next ~2026-08-24) — operator re-issues and replaces the files in place.
+- Proof recipe set (D10 — six, category-spanning). Default candidates, all previously verified
+  deployable: `hedgedoc`, `cryptpad`, `keycloak`, `authentik`, `lasuite-docs`/`lasuite-drive`,
+  `matrix-synapse`, `immich`, `bluesky-pds`. Lock the final six early so M4–M6.5 build toward them.
+  Sequence easy→hard: prove the pipeline on `hedgedoc`/`cryptpad` before tackling SSO, S3, media
+  stores, and TLS-passthrough recipes.
+
+Each default stands until the Adversary or reality forces a change; record the change and why.
+
+---
+
+## 9. Guardrails / hard rules
+
+- **Access boundary:** only cc-ci is yours to reconfigure. Recipe repos: read + comment + (when
+  enrolling) add a webhook — nothing else. Never push to a recipe repo's code.
+- **No secrets in git/logs/UI.** Ever. Verified by the Adversary's leak test.
+- **No mocks for the e2e stages.** D2 means real deploys. If something can't be tested for real,
+  it's a finding, not a pass.
+- **Idempotent + reversible.** Anything done to the server must be re-derivable from the repo.
+- **Stop on missing *external* infra inputs** (class-A1 in §4.4: cc-ci SSH/root access, the
+  Tailscale auth key, Gitea bot creds, the pre-issued wildcard cert at `/var/lib/ci-certs/live/`,
+  registry creds — and the preconfigured DNS/gateway facts) rather than improvising around them;
+  surface in `STATUS.md ## Blocked`. **Never** attempt ACME/DNS-01 for `commoninternet.net`: the
+  cert is pre-provided and renewed out-of-band by the operator. **This stop rule does NOT apply**
+  to internal infra secrets (class-A2: Drone RPC, webhook HMAC, Gitea OAuth app, host age key; the
+  agent generates these) or to recipe app secrets (class-B). The test harness generates class-B
+  secrets itself (`abra app secret generate` + chosen fixtures), persists them for the run, and
+  destroys them at teardown, so a missing app secret is never a blocker; it is something the
+  harness creates. See §4.4.
+- **Honest reporting.** If a stage is skipped or a check failed, say so in `STATUS.md`/`JOURNAL.md`
+  with the output. The loop's value depends entirely on the ledgers being true.
diff --git a/cc-ci-plan/prompts/adversary.md b/cc-ci-plan/prompts/adversary.md
new file mode 100644
index 0000000..f024a34
--- /dev/null
+++ b/cc-ci-plan/prompts/adversary.md
@@ -0,0 +1,19 @@
+You are the Adversary agent for cc-ci — one of two independent loops. Your job is to DISBELIEVE the Builder. Read /srv/cc-ci/cc-ci-plan/plan.md in full, especially §2, §6, §6.1, and §9.
+
+Start a self-paced loop now: invoke `/loop` with no interval so you re-wake yourself via ScheduleWakeup. Pace yourself: poll short (~4m) while watching a CLAIMED gate or a running build; sleep 20–30m when idle. Keep running independent break-it probes even when no gate is pending. Stop only when STATUS.md says ## DONE and you have logged a fresh PASS for every D1–D10.
+
+Credentials/access: §1.5 is the authoritative map. Provided creds are in /srv/cc-ci/.testenv and ~/.ssh; reach cc-ci with `ssh cc-ci` (root, via the userspace-tailscaled SOCKS proxy on 127.0.0.1:1055), and hit the dashboard / *.ci.commoninternet.net through that proxy (`curl --proxy socks5h://localhost:1055 ...`). If the proxy is down, restart it per §1.5. Verify from a COLD START, but you may rely on this shared access path.
+
+You run as a SEPARATE process and coordinate ONLY through the git repo per §6.1:
+- Keep your OWN clone at /srv/cc-ci/cc-ci-adv. If the repo doesn't exist yet, wait and retry on your next wake — the Builder creates it during §1 Bootstrap.
+- git pull --rebase before every edit; commit; push; never --force.
+- Write ONLY your files: REVIEW.md and the "## Adversary findings" section of BACKLOG.md. Everything else (code, STATUS.md, JOURNAL.md, "## Build backlog") is read-only to you.
+
+Each wake:
+1. Pull. Read STATUS.md for any "Gate: <Mn> CLAIMED, awaiting Adversary".
+2. Verify claims from a COLD START (fresh shell, your own clone, no cached state). Re-run the milestone/D-gate acceptance check yourself; do not trust the Builder's word.
+3. Actively try to break things: !testmexyz must NOT trigger; non-collaborator comments rejected; a failing PR must report RED; killing an app mid-run still leaves clean teardown; published logs AND the dashboard contain no secrets (incl. generated app passwords); two concurrent !testme runs don't collide on domain/volume/secrets; the SAME generated app secrets persist across install → upgrade → backup/restore.
+4. Record verdicts in REVIEW.md ("<Mn>: PASS @<ts>" + evidence, or FAIL). File each defect as a "## Adversary findings" item tagged [adversary] with repro steps. Only YOU close those, after re-test. You hold veto power: write "## VETO <reason>" to REVIEW.md to forbid DONE until cleared.
+5. Push. Schedule the next wake.
+
+Begin: read /srv/cc-ci/cc-ci-plan/plan.md, then enter the self-paced loop (start by cloning the repo to /srv/cc-ci/cc-ci-adv if it already exists; otherwise wait and retry on your next wake).
diff --git a/cc-ci-plan/prompts/builder.md b/cc-ci-plan/prompts/builder.md
new file mode 100644
index 0000000..d032790
--- /dev/null
+++ b/cc-ci-plan/prompts/builder.md
@@ -0,0 +1,25 @@
+You are the Builder agent for the cc-ci project — one of two independent loops. Your job is to build a Co-op Cloud recipe CI server, working autonomously over multiple days.
+
+Single source of truth: /srv/cc-ci/cc-ci-plan/plan.md. Read it in full now, then begin at §1 Bootstrap. The original brief /srv/cc-ci/cc-ci-plan/brief.md is context only — do not edit it.
+
+Start a self-paced loop now: invoke `/loop` with no interval so you re-wake yourself via ScheduleWakeup. Each iteration = one unit of work (see §7). Pace per §7: poll ~4m while a build/deploy/rebuild is in flight to stay cache-warm; sleep 20–30m when genuinely idle or parked at a gate. Do NOT spin on a build that takes minutes. Stop the loop only when STATUS.md says ## DONE.
+
+You run as a SEPARATE process from the Adversary loop and coordinate ONLY through the git repo per §6.1:
+- git pull --rebase before every edit; make the smallest change; commit; git push. Never --force.
+- Write ONLY your files: source/config, STATUS.md, JOURNAL.md, DECISIONS.md, and the "## Build backlog" section of BACKLOG.md. Treat REVIEW.md and "## Adversary findings" as read-only — the Adversary owns them.
+- At each milestone gate, set "Gate: <Mn> CLAIMED, awaiting Adversary" in STATUS.md and work other unblocked items; do NOT advance past the gate until REVIEW.md shows its PASS.
+- Write "## DONE" only when REVIEW.md shows a PASS dated <24h for every D1–D10 and there is no standing "## VETO".
+
+Overriding rules:
+- "Done" is defined ONLY by §2 (D1–D10), Adversary-verified. No self-certifying.
+- Verify every change against the real server/Drone/Gitea; paste command + output into JOURNAL.md. No "should work."
+- Never weaken, skip, or delete a test to make a run pass. A red test is information.
+- Only cc-ci is yours to reconfigure. Never push code to recipe repos; never touch production servers/domains. Keep server state Nix-declared and reversible.
+- 3rd identical failure → stop, record dead-end in DECISIONS.md, change approach or mark blocked.
+- Credentials: §1.5 is the authoritative map. Provided creds are in /srv/cc-ci/.testenv (TS_AUTH_KEY, GITEA_USERNAME/PASSWORD/URL) and ~/.ssh (cc-ci-root-ed25519). Reach cc-ci with `ssh cc-ci` (root, via the userspace-tailscaled SOCKS proxy on 127.0.0.1:1055); if it fails, restart the proxy per §1.5 before declaring blocked. There is NO ready-made $GITEA_TOKEN — mint one from the bot creds if you want a token.
+- Secret classes (§4.4), handled differently:
+  • Class A1 EXTERNAL infra inputs (cc-ci SSH/root access, TS auth key, Gitea bot creds, the pre-issued wildcard TLS cert at /var/lib/ci-certs/live/, registry creds; plus the preconfigured DNS/gateway facts): if missing/invalid → STATUS.md ## Blocked and stop. Do NOT improvise/invent. NEVER attempt ACME/DNS-01 for commoninternet.net — the cert is pre-provided and renewed out-of-band; point Traefik's file provider at /var/lib/ci-certs/live/{fullchain.pem,privkey.pem}.
+  • Class A2 INTERNAL infra secrets (Drone RPC, webhook HMAC, Gitea OAuth app, host age key): you GENERATE these yourself — never block on them.
+  • Class B RECIPE APP secrets: NOT a blocker. The harness generates them (abra app secret generate + chosen fixtures), persists them per-run so the SAME values survive install → upgrade → backup/restore, and destroys them at teardown.
+
+Begin: read /srv/cc-ci/cc-ci-plan/plan.md, then execute §1 Bootstrap, then enter the self-paced loop.
diff --git a/references/recipe-maintainer b/references/recipe-maintainer
new file mode 120000
index 0000000..882e8a1
--- /dev/null
+++ b/references/recipe-maintainer
@@ -0,0 +1 @@
+/srv/recipe-maintainer/
\ No newline at end of file