Initial commit: cc-ci autonomous orchestrator
Planning + launch + setup material for the cc-ci Co-op Cloud recipe CI server: plan.md (single source of truth), kickoff/launch supervision, and the Builder/Adversary loop prompts. Secrets (.testenv) and runtime dirs are gitignored.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
12
.gitignore
vendored
Normal file
@@ -0,0 +1,12 @@
# Secrets — NEVER commit
.testenv
*.tfstate
*.tfstate.*
*.key
*.pem

# Loop runtime / working clones (created at launch by launch.sh)
/cc-ci/
/cc-ci-adv/
/.cc-ci-watch/
/.cc-ci-logs/
34
cc-ci-plan/README.md
Normal file
@@ -0,0 +1,34 @@
# cc-ci-plan

Self-contained handoff package for building the **cc-ci** Co-op Cloud recipe CI server with two
autonomous Claude loops (a Builder and an adversarial Reviewer) running over days.

## Start here

1. Read **`plan.md`** — the full plan and single source of truth (mission, Definition of Done,
   architecture, milestones, the two-agent coordination protocol, loop discipline).
2. Read **`kickoff.md`** — how to launch and supervise the loops.
3. Run **`./launch.sh start`** to bring up both loops + the watchdog.

## Files

| File | Purpose |
|---|---|
| `plan.md` | The plan. Agents treat it as their single source of truth. |
| `brief.md` | The original one-page brief (context only; `plan.md` supersedes it). |
| `kickoff.md` | Launch & supervision guide. |
| `launch.sh` | Starts both loops + a watchdog; restarts dead loops; stops on `## DONE`. |
| `prompts/builder.md` | Builder loop prompt (fed to `claude` by the script). |
| `prompts/adversary.md` | Adversary loop prompt. |

## Before launching

- Set the org in `plan.md` (`git.autonomic.zone/recipe-maintainers/cc-ci`) and lock the six proof recipes (§8).
- Ensure the launching shell has: SSH+sudo to `cc-ci`, the Gitea token, `git.autonomic.zone` access.
- Preconfigure test-app DNS + TLS (plan §4.0): point a wildcard `*.ci.commoninternet.net` record at a gateway that TLS-passthroughs to cc-ci, and **pre-issue the wildcard cert** (`*.ci.commoninternet.net` + `ci.commoninternet.net`, via Gandi DNS-01) into `/var/lib/ci-certs/live/` on cc-ci. The agent handles everything else on cc-ci (Traefik file provider → that cert, swarm, routing) and does **no ACME**; renewal (~90 days) is an out-of-band operator task, so the DNS token never goes to the agent.
- `export CC_CI_REPO=https://git.autonomic.zone/recipe-maintainers/cc-ci.git` so the watchdog can detect `## DONE`.

## What "done" means

The loops stop only when all of `plan.md` §2 (D1–D10) hold **and** the Adversary has independently
re-verified each within 24h. The watchdog then tears the loops down automatically.
35
cc-ci-plan/brief.md
Normal file
@@ -0,0 +1,35 @@
we are working on making a CI server

I want you to work in an autonomous loop over the next few days until the CI server is fully functional, polished and documented

on any PR on git.autonomic.zone it should be invokable by writing !testme as a comment

this should invoke the set of CI tests to be run for the recipe code at that PR

the CI tests should be run via drone

the tests run for a recipe should be written in python. e2e testing via playwright should be used when necessary to confirm functionality

there should be tests which test
- new install
- upgrade
- backups (including restore)

all the tests should be fully e2e, with a real deployed recipe

the CI runner should be deployed on a server called cc-ci which is running nixos

the cc-ci git repo, which contains all the nix configuration for the server as well as the code for the CI test runner, should also live on git.autonomic.zone

the CI test runner should have its own folder of tests, with one folder for each recipe, with each of those folders containing a set of tests as python files which get invoked for that recipe

secrets should also be handled in a reasonable and repeatable way

additionally, if a recipe repo itself contains a tests folder in the recipe, the CI runner should also invoke those tests as part of the CI run for that recipe

the results of the test run should be easily viewable, with trackable logs, and a final result, very similar in style to the way the yunohost CI runner looks and feels

you will have ssh access to cc-ci server, as well as sudo access there

you will also have access to create and modify repos on git.autonomic.zone

143
cc-ci-plan/kickoff.md
Normal file
@@ -0,0 +1,143 @@
# cc-ci — Kickoff & Launch

Everything needed to start the autonomous cc-ci build loop. The substance lives in `plan.md`;
this file explains how to launch and supervise the two agents.

## Folder contents

```
cc-ci-plan/
├── plan.md            # THE plan — single source of truth (read this in full)
├── brief.md           # original one-page brief (context only; superseded by plan.md)
├── kickoff.md         # this file — how to launch & supervise
├── launch.sh          # starts both loops + watchdog, stops on ## DONE
└── prompts/
    ├── builder.md     # Builder loop prompt (fed to claude by launch.sh)
    └── adversary.md   # Adversary loop prompt
```

> Note: `/srv/cc-ci/cc-ci-plan/` (this folder) is the **planning + launch material**. The actual
> CI project — NixOS config, runner, tests — lives in a **separate git repo** the Builder creates
> at `git.autonomic.zone/recipe-maintainers/cc-ci`, cloned to `/srv/cc-ci/cc-ci` (Builder) and
> `/srv/cc-ci/cc-ci-adv` (Adversary). Don't confuse the two.

## Model: two independent loops (plan §6 / §6.1)

- **Builder** — builds the CI server; owns code + `STATUS.md`/`JOURNAL.md`/`DECISIONS.md` + the
  `## Build backlog` section of `BACKLOG.md`.
- **Adversary** — independently disbelieves and re-verifies; owns `REVIEW.md` +
  `## Adversary findings`. Holds veto over `## DONE`.

They run as two separate processes and coordinate **only** through the git repo. Single-writer file
ownership keeps concurrent pushes merge-clean.

## Two layers of "looping" — and why you want both

| Concern | Mechanism | Who provides it |
|---|---|---|
| **Iteration** — keep doing one unit of work, then wake again | `/loop` self-paced (ScheduleWakeup), per plan §7 pacing | each agent, in-session |
| **Resilience** — restart a loop whose process/sandbox died; stop all on `## DONE` | `launch.sh` watchdog (tmux + git poll) | this script |

`/loop` alone is bound to its process: if the sandbox restarts, that loop is gone until something
relaunches it. The watchdog is that something. Use both.

## Launch

```bash
cd /srv/cc-ci/cc-ci-plan

# Optional but recommended once the repo exists, so the watchdog can detect ## DONE:
export CC_CI_REPO=https://git.autonomic.zone/recipe-maintainers/cc-ci.git

./launch.sh start             # starts cc-ci-builder + cc-ci-adv + cc-ci-watchdog (tmux sessions)
./launch.sh status            # session + DONE state
./launch.sh logs builder      # tail a loop; also: logs adversary | logs watchdog
tmux attach -t cc-ci-builder  # watch a loop live locally (detach: Ctrl-b d)
./launch.sh stop              # stop everything
```

`launch.sh` is idempotent — re-running `start` won't duplicate a live session. Each agent runs as an
**interactive** `claude` in tmux (kickoff prompt passed as a positional arg, *not* piped — piping
forces print mode and breaks `/loop`). With `REMOTE_CONTROL=1` (default) each agent is launched with
`--remote-control`, so you can **watch and steer both loops from [claude.ai/code](https://claude.ai/code)**
(or the Claude mobile app) — not just via `tmux attach`. The box must be logged into the claude.ai
account (`claude auth status`); set `REMOTE_CONTROL=0` to skip the remote surface. The watchdog
(default every 300s) restarts any dead session — note a >~10-min network outage will exit the
`claude` process, after which the watchdog brings it back (a fresh remote-control session) — and
when `STATUS.md` shows `## DONE`, it kills the loops and exits.

Prerequisites the sessions inherit from your shell: SSH (root) to `cc-ci` via the Tailscale proxy
(§1.5), Gitea bot creds, and `git.autonomic.zone` access. Plus **preconfigured** operator inputs the
loop depends on (plan §4.0/§4.4): the wildcard `*.ci.commoninternet.net` DNS record pointing at a
gateway that TLS-passthroughs to cc-ci, and the **pre-issued wildcard cert** at
`/var/lib/ci-certs/live/` on cc-ci. The operator owns the DNS record + gateway + cert
issuance/renewal; the agent builds Traefik (file provider → that cert) + routing on cc-ci and does
**no ACME**. If any prerequisite is absent, the Builder parks at `STATUS.md ## Blocked` (plan §1/§9)
rather than improvise.

> Host deps: `launch.sh` needs **tmux** (and `claude`) — tmux is installed on this sandbox host
> (3.5a). On a fresh host: `sudo apt-get install -y tmux`. The script's `*_DIR`
> defaults now point at `/srv/cc-ci/...` (Builder clone `/srv/cc-ci/cc-ci`, Adversary
> `/srv/cc-ci/cc-ci-adv`); override the `*_DIR` env vars only if your layout differs.

## Optional: a cloud-side `/schedule` watchdog

`launch.sh`'s watchdog is itself a local process — if the *whole host* goes down it stops too. For
belt-and-suspenders durability, also create a `/schedule` routine (a remote agent that fires on a
cron and re-orients from the repo). From inside a Claude session:

```
/schedule every 2 hours: read /srv/cc-ci/cc-ci-plan/plan.md §7 and the cc-ci repo STATUS.md; if the
Builder/Adversary loops are not making progress (or launch.sh is not running), restart them via
/srv/cc-ci/cc-ci-plan/launch.sh start; stop when STATUS.md says ## DONE.
```

This complements the local watchdog: scheduled runs are fresh, independent agents, so they survive
process/context death that would take the in-session `/loop` and the local watchdog with it.

## Fallback: restart/recreate the cc-ci VM (orchestrator only)

**This is primarily an escape hatch for *you*, the supervising orchestrator.** The loops normally
reconfigure cc-ci only from inside (via Nix); power-cycling or recreating the VM shouldn't be their
default move — but it's not forbidden if one gets genuinely stuck. Reach for this when cc-ci itself
is wedged at a level that can't be fixed from inside (won't boot, disk full, swarm/Docker corrupted,
unreachable even after a proxy restart): use the Incus skill to power-cycle or rebuild the VM, then
re-bootstrap.

`cc-nix-test` (the cc-ci server, tailnet `100.90.116.4`) is a **NixOS Incus VM** on host **b1**
(`100.117.251.31:8443`, Incus project `terraform-ci`). Skill + Terraform live at
`/srv/incus-terraform-nix-vm-creator/` (`skills/incus-terraform/SKILL.md`); read that for full usage.

- **Access:** b1 is on the *same* cc-ci tailnet, so reach the Incus API through the existing
  `cc-ci-tailscaled` SOCKS proxy (`127.0.0.1:1055`) with the mTLS certs in that repo's
  `terraform-secrets/` — no second tailscaled needed. Quick check:
  ```bash
  CRT=/srv/incus-terraform-nix-vm-creator/terraform-secrets/terraform.crt
  KEY=/srv/incus-terraform-nix-vm-creator/terraform-secrets/terraform.key
  # Quote the URL so the shell does not treat "?" as a glob character.
  curl --proxy socks5h://localhost:1055 --cert "$CRT" --key "$KEY" -k -s \
    "https://100.117.251.31:8443/1.0/instances/cc-nix-test/state?project=terraform-ci"
  ```
- **Soft restart (keeps the disk — preferred):** `PUT .../1.0/instances/cc-nix-test/state?project=terraform-ci`
  with `{"action":"restart"}` (or `"stop"` / `"start"`). The Incus REST API changes instance state
  with `PUT` on this endpoint; `GET` (the quick check above) only reads it.
- **Full recreate (last resort):** the Terraform module in `/srv/incus-terraform-nix-vm-creator/projects/`
  (`terraform apply` with `-var incus_remote_address=100.117.251.31 -var incus_project=terraform-ci
  -var ts_auth_key=$TSKEY`). ⚠ **Recreating wipes the VM disk** — you must then re-apply the cc-ci
  preconditions: the pre-issued TLS cert into `/var/lib/ci-certs/live/` and the
  `cc-ci-root-ed25519` pubkey into root's `authorized_keys` (see the access notes), and the loops
  re-run §1 Bootstrap. Prefer a soft restart; only recreate if the VM is truly unrecoverable.
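
The soft restart can be sketched as a small shell helper. This is a minimal sketch, not the Incus skill's own tooling: the function names `incus_state_payload` and `incus_soft_restart`, the `INCUS_URL` and `PROXY` variables, and the 30-second `timeout` default are assumptions made for illustration. The cert paths, proxy address, instance name, and project come from the notes above. It sends `PUT`, which is the method the Incus REST API documents for instance state changes.

```bash
#!/usr/bin/env bash
# Sketch only: helper names, variables, and defaults are illustrative assumptions.
set -euo pipefail

SECRETS=/srv/incus-terraform-nix-vm-creator/terraform-secrets
INCUS_URL="${INCUS_URL:-https://100.117.251.31:8443}"   # b1, reached over the tailnet
PROXY="${PROXY:-socks5h://localhost:1055}"              # cc-ci-tailscaled SOCKS proxy

# Build the JSON body for a state change ("restart", "stop" or "start").
incus_state_payload() {
  local action="$1" timeout="${2:-30}"
  printf '{"action":"%s","timeout":%d}' "$action" "$timeout"
}

# Ask Incus to restart one instance in the terraform-ci project.
incus_soft_restart() {
  local instance="${1:-cc-nix-test}"
  curl --proxy "$PROXY" -k -sS -X PUT \
    --cert "$SECRETS/terraform.crt" --key "$SECRETS/terraform.key" \
    -H 'Content-Type: application/json' \
    -d "$(incus_state_payload restart)" \
    "$INCUS_URL/1.0/instances/$instance/state?project=terraform-ci"
}
```

Calling `incus_soft_restart` with no argument targets `cc-nix-test`. The request returns a background operation rather than waiting for the VM to come back, so re-running the quick check above is still the way to confirm the instance is running again.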

(Project cap: keep total RAM across `terraform-ci` instances under 10 GB — check before recreating.)

## Manual launch (no script)

If you'd rather not use `launch.sh`, start each agent interactively yourself (same result, no
supervision/restart), passing the prompt as a positional argument so the session stays interactive
and remote-controllable:

```bash
claude --remote-control 'cc-ci-builder' --dangerously-skip-permissions "$(cat prompts/builder.md)"
claude --remote-control 'cc-ci-adv' --dangerously-skip-permissions "$(cat prompts/adversary.md)"
```

Do **not** pipe the prompt (`cat prompts/builder.md | claude …`) — that forces print/headless mode,
which breaks `/loop` and remote control.
203
cc-ci-plan/launch.sh
Executable file
@@ -0,0 +1,203 @@
#!/usr/bin/env bash
#
# launch.sh — start and supervise the two cc-ci autonomous loops + a watchdog.
#
# Model (see plan.md §6 / §6.1): two INDEPENDENT Claude Code sessions —
#   • Builder   (tmux session: cc-ci-builder)  working clone /srv/cc-ci/cc-ci
#   • Adversary (tmux session: cc-ci-adv)      working clone /srv/cc-ci/cc-ci-adv
# coordinating only through the git repo on git.autonomic.zone.
#
# Each agent self-paces with a `/loop` (ScheduleWakeup) — that handles ITERATION.
# This script's watchdog handles RESILIENCE: it restarts a session that has died
# and stops everything once STATUS.md reports "## DONE".
#
# Usage:
#   ./launch.sh start        # start both loops + watchdog (idempotent)
#   ./launch.sh watchdog     # run only the supervision loop in the foreground
#   ./launch.sh status       # show session + DONE state
#   ./launch.sh logs builder|adversary|watchdog   # tail a session/log
#   ./launch.sh stop         # stop both loops + watchdog
#
# Configure via env vars (defaults below). At minimum set CC_CI_REPO once the
# Builder has created the repo, so the watchdog can detect DONE.

set -euo pipefail

# ----- config -------------------------------------------------------------
PLAN_DIR="${PLAN_DIR:-/srv/cc-ci/cc-ci-plan}"
CLAUDE_BIN="${CLAUDE_BIN:-claude}"
# Flags for unattended operation in a sandbox. Override if your setup differs.
CLAUDE_FLAGS="${CLAUDE_FLAGS:---dangerously-skip-permissions}"
# REMOTE_CONTROL=1 launches each agent as an INTERACTIVE session with --remote-control,
# viewable/steerable at claude.ai/code (and the Claude mobile app). This is required for
# /loop + ScheduleWakeup to work at all (they are interactive-only — a piped/print-mode
# session cannot self-pace). Set REMOTE_CONTROL=0 for a plain interactive session with no
# remote surface. The box must be logged into the claude.ai account (run `claude` once to
# check `claude auth status`). Each agent gets its own RC session named after its tmux session.
REMOTE_CONTROL="${REMOTE_CONTROL:-1}"

BUILDER_DIR="${BUILDER_DIR:-/srv/cc-ci/cc-ci}"      # Builder's repo clone (it creates this)
ADV_DIR="${ADV_DIR:-/srv/cc-ci/cc-ci-adv}"          # Adversary's repo clone
WATCH_DIR="${WATCH_DIR:-/srv/cc-ci/.cc-ci-watch}"   # tiny clone the watchdog reads STATUS.md from
LOG_DIR="${LOG_DIR:-/srv/cc-ci/.cc-ci-logs}"

CC_CI_REPO="${CC_CI_REPO:-https://git.autonomic.zone/recipe-maintainers/cc-ci.git}"  # CI project repo (DONE detection); harmless until the Builder creates it
CC_CI_BRANCH="${CC_CI_BRANCH:-main}"

WATCH_INTERVAL="${WATCH_INTERVAL:-300}"             # seconds between watchdog checks

BUILDER_SESSION="cc-ci-builder"
ADV_SESSION="cc-ci-adv"
WATCHDOG_SESSION="cc-ci-watchdog"
# --------------------------------------------------------------------------

log() { printf '[launch %(%H:%M:%S)T] %s\n' -1 "$*"; }
die() { log "ERROR: $*"; exit 1; }

need() { command -v "$1" >/dev/null 2>&1 || die "missing dependency: $1"; }

preflight() {
  need tmux
  command -v "$CLAUDE_BIN" >/dev/null 2>&1 || die "claude CLI not found (set CLAUDE_BIN)"
  [[ -f "$PLAN_DIR/prompts/builder.md" ]] || die "missing $PLAN_DIR/prompts/builder.md"
  [[ -f "$PLAN_DIR/prompts/adversary.md" ]] || die "missing $PLAN_DIR/prompts/adversary.md"
  mkdir -p "$LOG_DIR"
}

session_alive() { tmux has-session -t "$1" 2>/dev/null; }

# Start one agent loop in its own tmux session, cd'd into its working dir, with
# the kickoff prompt passed to claude as a positional argument (see below for why
# not stdin).
start_agent() {
  local session="$1" workdir="$2" prompt_file="$3"
  if session_alive "$session"; then
    log "$session already running — leaving it"
    return 0
  fi
  mkdir -p "$workdir"
  log "starting $session (cwd=$workdir, remote_control=$REMOTE_CONTROL)"
  # tmux gives claude a real PTY, so we run claude INTERACTIVELY (required for /loop +
  # ScheduleWakeup). The kickoff prompt is passed as a POSITIONAL argument via an inner
  # `$(cat ...)` — NOT piped on stdin, because piping forces print/headless mode which
  # breaks both interactivity and --remote-control. The `\$(...)` defers to the inner shell
  # so the whole multi-line prompt arrives as a single argument.
  local rc=""
  [[ "$REMOTE_CONTROL" == "1" ]] && rc="--remote-control '$session'"
  tmux new-session -d -s "$session" -c "$workdir" \
    "$CLAUDE_BIN $rc $CLAUDE_FLAGS \"\$(cat '$prompt_file')\""
  # Log the pane WITHOUT redirecting claude's stdout: a `>>log` redirect makes stdout a
  # non-tty and drops claude out of interactive/remote-control mode. pipe-pane mirrors the
  # live pane to the log file while claude keeps the PTY tmux gave it.
  tmux pipe-pane -o -t "$session" "cat >> '$LOG_DIR/$session.log'"
}

start_loops() {
  start_agent "$BUILDER_SESSION" "$BUILDER_DIR" "$PLAN_DIR/prompts/builder.md"
  start_agent "$ADV_SESSION" "$ADV_DIR" "$PLAN_DIR/prompts/adversary.md"
}

# Returns 0 (true) if the repo's STATUS.md contains a "## DONE" heading.
is_done() {
  [[ -n "$CC_CI_REPO" ]] || return 1
  if [[ ! -d "$WATCH_DIR/.git" ]]; then
    git clone --depth 1 --branch "$CC_CI_BRANCH" "$CC_CI_REPO" "$WATCH_DIR" >/dev/null 2>&1 || return 1
  fi
  git -C "$WATCH_DIR" fetch --depth 1 origin "$CC_CI_BRANCH" >/dev/null 2>&1 || return 1
  git -C "$WATCH_DIR" reset --hard "origin/$CC_CI_BRANCH" >/dev/null 2>&1 || return 1
  grep -qE '^##[[:space:]]+DONE' "$WATCH_DIR/STATUS.md" 2>/dev/null
}

watchdog_loop() {
  log "watchdog up (interval=${WATCH_INTERVAL}s, repo=${CC_CI_REPO:-<unset: DONE-detection disabled>})"
  while true; do
    # 1) DONE? then wind everything down.
    if is_done; then
      log "STATUS.md reports ## DONE — stopping loops."
      stop_loops
      log "watchdog exiting (project complete)."
      exit 0
    fi
    # 2) restart any dead loop (resilience the in-session /loop can't provide).
    if ! session_alive "$BUILDER_SESSION"; then
      log "builder session gone — restarting"
      start_agent "$BUILDER_SESSION" "$BUILDER_DIR" "$PLAN_DIR/prompts/builder.md"
    fi
    if ! session_alive "$ADV_SESSION"; then
      log "adversary session gone — restarting"
      start_agent "$ADV_SESSION" "$ADV_DIR" "$PLAN_DIR/prompts/adversary.md"
    fi
    sleep "$WATCH_INTERVAL"
  done
}

start_watchdog() {
  if session_alive "$WATCHDOG_SESSION"; then
    log "watchdog already running"
    return 0
  fi
  log "starting watchdog"
  # Resolve this script to an absolute path first: the tmux session starts in
  # $PLAN_DIR, so a relative $0 from another working directory would not resolve.
  local self
  self="$(cd "$(dirname "$0")" && pwd)/$(basename "$0")"
  tmux new-session -d -s "$WATCHDOG_SESSION" -c "$PLAN_DIR" \
    "exec >>'$LOG_DIR/watchdog.log' 2>&1; '$self' watchdog"
}

stop_loops() {
  for s in "$BUILDER_SESSION" "$ADV_SESSION"; do
    if session_alive "$s"; then log "killing $s"; tmux kill-session -t "$s" || true; fi
  done
}

cmd_status() {
  for s in "$BUILDER_SESSION" "$ADV_SESSION" "$WATCHDOG_SESSION"; do
    if session_alive "$s"; then echo "  $s: RUNNING"; else echo "  $s: stopped"; fi
  done
  if [[ -n "$CC_CI_REPO" ]]; then
    if is_done; then echo "  project: ## DONE"; else echo "  project: in progress"; fi
  else
    echo "  project: (CC_CI_REPO unset — DONE-detection disabled)"
  fi
}

case "${1:-}" in
  start)
    preflight
    start_loops
    start_watchdog
    log "started. inspect with: ./launch.sh status | attach: tmux attach -t $BUILDER_SESSION"
    ;;
  watchdog) preflight; watchdog_loop ;;
  status) cmd_status ;;
  logs)
    case "${2:-}" in
      builder) tail -f "$LOG_DIR/$BUILDER_SESSION.log" ;;
      adversary) tail -f "$LOG_DIR/$ADV_SESSION.log" ;;
      watchdog) tail -f "$LOG_DIR/watchdog.log" ;;
      *) die "usage: $0 logs builder|adversary|watchdog" ;;
    esac
    ;;
  stop)
    stop_loops
    if session_alive "$WATCHDOG_SESSION"; then log "killing $WATCHDOG_SESSION"; tmux kill-session -t "$WATCHDOG_SESSION" || true; fi
    log "stopped."
    ;;
  *)
    cat <<EOF
cc-ci loop launcher

  $0 start                              start both loops + watchdog (idempotent)
  $0 status                             show session + DONE state
  $0 logs builder|adversary|watchdog    tail a log
  $0 stop                               stop everything
  $0 watchdog                           run supervision loop in foreground

Key env vars (current value):
  CC_CI_REPO     = ${CC_CI_REPO:-<unset — set to enable DONE detection>}
  CLAUDE_BIN     = $CLAUDE_BIN
  CLAUDE_FLAGS   = $CLAUDE_FLAGS
  REMOTE_CONTROL = $REMOTE_CONTROL   (1 = interactive --remote-control, viewable at claude.ai/code)
  BUILDER_DIR    = $BUILDER_DIR
  ADV_DIR        = $ADV_DIR
  WATCH_INTERVAL = ${WATCH_INTERVAL}s
EOF
    ;;
esac
635
cc-ci-plan/plan.md
Normal file
@@ -0,0 +1,635 @@
# cc-ci — Co-op Cloud Recipe CI Server (Autonomous Build Plan)

**Status:** ACTIVE — autonomous loop
**Owner agent:** Builder (primary) + Adversary (reviewer)
**Source brief:** `brief.md` (do not edit; this file supersedes it)
**This file's canonical path:** `/srv/cc-ci/cc-ci-plan/plan.md`
**Target server:** `cc-ci` (NixOS)
**Code/config home:** `git.autonomic.zone/recipe-maintainers/cc-ci` (the CI project repo — distinct from this
`/srv/cc-ci/cc-ci-plan/` planning+launch folder)
**Last updated:** keep current via `STATUS.md` (see §7)

---

## 0. How to read this document

This plan is written to be handed to an **autonomous Claude agent running in a sandbox over
several days**, driving itself in a loop until the CI server is "done" per §2. A second agent
(the **Adversary**) independently tries to disprove every "done" claim. Neither agent is
trusted to mark its own work complete.

If you are an agent waking up into this loop for the first time, go straight to **§1 Bootstrap**.
On every subsequent wake, go to **§7 The Loop Protocol** and continue from `STATUS.md`.

The rest of the document (§3–§6) is the technical design. Treat it as the default architecture,
but you are allowed to revise it when reality disagrees — record any deviation in `DECISIONS.md`
with a one-line rationale.

---

## 1. Bootstrap (first wake only)

Do these in order. Each step is idempotent; re-running is safe.

1. **Verify access.** (Full credential map + how each is used is in **§1.5** — read it first.)
   - `ssh cc-ci 'hostname && whoami'` — you log in as **root** on cc-ci (NixOS), so there is no
     separate sudo step. `ssh cc-ci` is preconfigured to tunnel through the userspace-tailscaled
     SOCKS proxy (§1.5); if it fails, the proxy/daemon is probably down — restart it (§1.5) before
     declaring blocked.
   - `ssh cc-ci 'nixos-version'` — confirm NixOS.
   - Confirm you can reach the Gitea API with the bot creds from `.testenv` (§1.5):
     `curl -s https://$GITEA_URL/api/v1/version`. The bot authenticates with
     `GITEA_USERNAME`/`GITEA_PASSWORD` (basic auth) or a token you mint from them via
     `POST /api/v1/users/<user>/tokens` — do **not** expect a ready-made `$GITEA_TOKEN`.
   - Confirm the **preconfigured** test-app DNS (§4.0/§4.4): a random subdomain under the wildcard
     resolves, e.g. `getent hosts probe-$RANDOM.ci.commoninternet.net` returns the **gateway's** IP
     (not cc-ci's — the gateway TLS-passthroughs to cc-ci, so do not expect cc-ci's address; and use
     `getent`, not `dig`, since this host's resolver is Tailscale-only — see §1.5).
     Traefik is *not* up yet — you configure it (file provider → the pre-issued cert at
     `/var/lib/ci-certs/live/`, **no ACME**); the DNS record + gateway passthrough + cert are the
     preconditions, and full end-to-end HTTPS reachability is proven at M1, not now.
     If the wildcard does not resolve at all, that's a `## Blocked` item (operator fixes DNS/gateway).
   - If any check fails, write the failure to `STATUS.md` under `## Blocked` and stop — a human must fix access. Do **not** try to work around missing access.

2. **Create the `cc-ci` repo** on git.autonomic.zone if it does not exist. Push an initial
   skeleton (see §3 layout). The Builder clones to `/srv/cc-ci/cc-ci`; the Adversary loop keeps
   its **own independent clone** at `/srv/cc-ci/cc-ci-adv`. The repo is the only channel between
   the two loops (§6.1) — loop state lives inside it (`STATUS.md`, `BACKLOG.md`, etc.).

3. **Snapshot the starting environment** into `cc-ci/docs/baseline.md`: current NixOS config on
   the server (`/etc/nixos` or existing flake), installed packages, whether Docker/Swarm/abra
   already exist, DNS that already points at the box. This is the rollback reference.

4. **Seed the loop state files** (§7) if absent: `STATUS.md`, `BACKLOG.md`, `REVIEW.md`,
   `JOURNAL.md`, `DECISIONS.md`. Give `BACKLOG.md` two H2 sections — `## Build backlog`
   (populated from §5 milestones) and `## Adversary findings` (empty) — per the single-writer
   rule in §6.1.

5. Commit ("chore: bootstrap cc-ci loop state") and begin the loop at §7.
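
Step 4 can be sketched as an idempotent shell function. This is a minimal sketch under stated assumptions: the function name `seed_loop_state` and the one-line placeholder headings are illustrative (the plan fixes only the file names and the two `BACKLOG.md` sections), and populating `## Build backlog` from the §5 milestones is left out.

```bash
#!/usr/bin/env bash
# Sketch only: function name and placeholder headings are illustrative assumptions.
set -euo pipefail

# Create any missing loop state file in the given clone; never overwrite an existing one.
seed_loop_state() {
  local repo="$1" f
  for f in STATUS.md REVIEW.md JOURNAL.md DECISIONS.md; do
    if [[ ! -e "$repo/$f" ]]; then
      printf '# %s\n' "${f%.md}" > "$repo/$f"   # placeholder heading (assumption)
    fi
  done
  if [[ ! -e "$repo/BACKLOG.md" ]]; then
    # Two H2 sections, one per writer (single-writer rule, §6.1).
    printf '# BACKLOG\n\n## Build backlog\n\n## Adversary findings\n' > "$repo/BACKLOG.md"
  fi
}
```

For example, `seed_loop_state /srv/cc-ci/cc-ci` run from the Builder's clone is safe to repeat on every wake, because files that already exist are left untouched.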

---

## 1.5 Credentials & access — where everything lives and how to use it

The loops run **on the sandbox host** (not on cc-ci) and reach cc-ci over Tailscale. This section
is the authoritative map of what credentials exist, where, and how to use them. **Never copy any
secret value into the repo, a commit, a log, or the dashboard** (§9) — reference locations only.

### Provided credentials (already in place)

| What | Where | How to use |
|---|---|---|
| **Tailscale auth key** (joins cc-ci's tailnet `taila4a0bf.ts.net`) | `/srv/cc-ci/.testenv` → `TS_AUTH_KEY` (Tailscale SaaS key, keyID ends `CNTRL`) | Used to bring up the userspace tailscaled (below). It's reusable; re-run `tailscale up` with it if the node drops. |
| **cc-ci SSH (root)** | private key `~/.ssh/cc-ci-root-ed25519`; config `Host cc-ci` in `~/.ssh/config` | Just run `ssh cc-ci` (logs in as **root**). The pubkey is already in cc-ci's `/root/.ssh/authorized_keys`. |
| **Gitea bot account** | `/srv/cc-ci/.testenv` → `GITEA_USERNAME` (`autonomic-bot`), `GITEA_PASSWORD`, `GITEA_URL` (`git.autonomic.zone`) | Basic-auth to the Gitea API, or mint a scoped token: `POST https://$GITEA_URL/api/v1/users/$GITEA_USERNAME/tokens`. Used to create/push the `cc-ci` repo, read recipe repos, comment on PRs, and register `!testme` webhooks. |

Load them in a shell with: `set -a; . /srv/cc-ci/.testenv; set +a` (don't echo the values).
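
Minting a token from the bot credentials could look like the sketch below. The endpoint and the basic-auth flow are the ones named in the table; the helper names, the token name `cc-ci-loop`, and the scope list are assumptions for illustration. Scoped tokens and their scope names depend on the Gitea version running on `git.autonomic.zone`, so check that instance's API documentation before relying on them.

```bash
#!/usr/bin/env bash
# Sketch only: helper names, token name, and scopes are illustrative assumptions.
set -euo pipefail

# JSON body for the token request. Adjust the scopes to the server's Gitea version.
gitea_token_payload() {
  local name="$1"
  printf '{"name":"%s","scopes":["write:repository","write:issue"]}' "$name"
}

# Request a token with basic auth. Expects GITEA_URL, GITEA_USERNAME and
# GITEA_PASSWORD in the environment (loaded from .testenv, never echoed).
mint_gitea_token() {
  local name="${1:-cc-ci-loop}"
  curl -fsS -u "${GITEA_USERNAME}:${GITEA_PASSWORD}" \
    -H 'Content-Type: application/json' \
    -d "$(gitea_token_payload "$name")" \
    "https://${GITEA_URL}/api/v1/users/${GITEA_USERNAME}/tokens"
}
```

The response is a JSON object whose `sha1` field carries the token value. Capture it into a variable or a sops-encrypted file rather than printing it, in line with the no-secrets-in-logs rule above.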

### The Tailscale connection (how `ssh cc-ci` and the proxy work)

cc-ci (`cc-nix-test`, **100.90.116.4**) is on a *different* tailnet than the sandbox host's default
one, so it is reached via a **second, userspace tailscaled** — this keeps the host's own tailnet
untouched. State lives in `~/.cc-ci-ts/`; it exposes a **SOCKS5/HTTP proxy on `127.0.0.1:1055`**,
which is the only route to that tailnet (userspace networking ⇒ the host OS can't route the tailnet
IPs directly).

It runs as a **persistent systemd service** (`cc-ci-tailscaled.service`, enabled, `Restart=always`,
starts on boot; unit at `/etc/systemd/system/cc-ci-tailscaled.service`, runs as user `notplants`).
It reuses the already-authenticated state in `~/.cc-ci-ts/`, so it reconnects across reboots/crashes
without the auth key.

- `ssh cc-ci` works out of the box (its `ProxyCommand` uses the proxy; logs in as root).
- For HTTP(S) to cc-ci / `*.ci.commoninternet.net` from the sandbox, go through the proxy, e.g.
  `curl --proxy socks5h://localhost:1055 https://<app>.ci.commoninternet.net`.
- **If connectivity is down:** `sudo systemctl restart cc-ci-tailscaled` (diagnose with
  `systemctl status cc-ci-tailscaled` / `journalctl -u cc-ci-tailscaled`). A dead proxy is an access
  failure to recover, not a `## Blocked`-and-stop condition — *unless* the auth key itself is
  rejected (then re-auth with `tailscale --socket=$HOME/.cc-ci-ts/tailscaled.sock up
  --auth-key="$TS_AUTH_KEY" --hostname=cc-ci-claude-sandbox --accept-routes --accept-dns=false`, and
  if that fails the key is a class-A1 blocker).
- **DNS gotcha:** this host's `/etc/resolv.conf` lists only Tailscale resolvers, so direct
  `dig @1.1.1.1 …` queries get no answer and look falsely empty. Use `getent hosts <name>` to
  resolve from the sandbox. `commoninternet.net` itself is a normal public zone hosted at **Gandi**.

### Credentials the loop GENERATES itself (do not wait on a human for these)

- **Drone RPC secret** and **webhook HMAC secret** — generate (`openssl rand -hex 32`), store
  sops-encrypted in `secrets/`, and wire both ends. Internal shared secrets, not human inputs.
- **Gitea OAuth app for Drone** — create it under the bot account via the API
  (`POST /api/v1/user/applications/oauth2`); capture client id/secret into `secrets/`.
- **cc-ci host age/GPG key for sops** — generate on the host (or derive from its SSH host key);
  add as a sops recipient. Keep a recovery copy of the master age identity off-box if desired.
- **Per-recipe app secrets** (class-B, §4.4) — the harness generates these per run.
|
||||
|
||||
### Credentials STILL NEEDED from the operator (class-A — block if missing, per §9)
|
||||
|
||||
- **Wildcard TLS cert — PROVIDED, not a token.** The operator has pre-issued the wildcard SAN cert
|
||||
(`*.ci.commoninternet.net` + `ci.commoninternet.net`) and placed it on cc-ci at
|
||||
`/var/lib/ci-certs/live/{fullchain.pem,privkey.pem}` (§4.0). The agent points Traefik's file
|
||||
provider at those paths and runs **no ACME** for this domain. **Do not request or expect a
|
||||
`commoninternet.net` DNS token** — issuance/renewal is handled out-of-band by the operator (LE
|
||||
90-day cert; next renewal ~2026-08-24). A missing/expired cert is a finding for the operator, not
|
||||
an agent re-issue.
|
||||
- **Registry pull credentials** (e.g. Docker Hub) — *recommended* to avoid anonymous pull-rate
|
||||
limits breaking deploys under load. Treat a rate-limit failure traced to this as a finding, then
|
||||
request creds. Store sops-encrypted in `secrets/`.
|
||||
- **Gitea bot permissions** (a grant, not a secret) — confirm `autonomic-bot` can: create/push
|
||||
`recipe-maintainers/cc-ci`, read the recipe repos to be enrolled, comment on their PRs, and add
|
||||
webhooks to them. If any is missing, that's a `## Blocked` item for the operator to fix.
|
||||
|
||||
---
|
||||
|
||||
## 2. Definition of Done (the loop's exit condition)
|
||||
|
||||
The loop terminates **only** when every item below is true *and the Adversary has independently
|
||||
re-verified each one within the last 24h* (logged in `REVIEW.md` with timestamps and command
|
||||
output). Partial credit does not count.
|
||||
|
||||
- [ ] **D1 — Trigger.** Commenting `!testme` on any open PR in any enrolled recipe repo on
|
||||
git.autonomic.zone starts a CI run for the code *at that PR's head commit* within 60s.
|
||||
Other comments do not. Re-commenting re-runs.
|
||||
- [ ] **D2 — Test matrix.** For a recipe under test, the run executes, as separate reported
|
||||
stages: **new install**, **upgrade** (previous published version → PR version), and
|
||||
**backup + restore**. All are genuine end-to-end against a really-deployed recipe (real
|
||||
containers, real Traefik routing, real volumes) — no mocks, no stubs.
|
||||
- [ ] **D3 — Python + Playwright.** Tests are Python. Functional assertions that require a
|
||||
browser use Playwright against the live deployed app.
|
||||
- [ ] **D4 — Recipe-local tests.** If the recipe repo contains its own `tests/` folder, those
|
||||
tests are also discovered and run as part of the same CI run, with results merged in.
|
||||
- [ ] **D5 — Per-recipe test tree.** The cc-ci repo holds `tests/<recipe>/` with the
|
||||
install/upgrade/backup tests as Python files, plus a shared harness. Adding a new recipe is
|
||||
a documented, small, repeatable operation.
|
||||
- [ ] **D6 — Secrets.** App + infra secrets are handled reproducibly (committed encrypted,
|
||||
decrypted on the server), documented, and rotatable. No plaintext secrets in git, logs, or
|
||||
the results UI.
|
||||
- [ ] **D7 — Results UX.** Each run has a stable URL with live, tail-able logs per stage and a
|
||||
final pass/fail; there is an overview page listing recipes with their latest status —
|
||||
look-and-feel comparable to the YunoHost app CI (`ci-apps.yunohost.org`). A PR comment links
|
||||
back to its run and reflects the outcome.
|
||||
- [ ] **D8 — Reproducible server.** The entire server (Drone, runner, comment bridge, swarm,
|
||||
Traefik, dashboard, secrets wiring) is declared in the `cc-ci` repo's NixOS flake and can be
|
||||
rebuilt from scratch onto a blank NixOS host following `docs/install.md`, verified by the
|
||||
Adversary doing exactly that on a throwaway VM (or documenting why a full from-scratch
|
||||
rebuild was infeasible and what was tested instead).
|
||||
- [ ] **D9 — Documentation.** `README.md` + `docs/` explain architecture, how to enroll a recipe,
|
||||
how to add/run tests locally, how to operate/rotate secrets, and how to debug a failed run.
|
||||
A new engineer can enroll a recipe and get a green run using only the docs.
|
||||
- [ ] **D10 — Proof (breadth).** At least **six real recipes** spanning the meaningful
|
||||
categories have a full green run triggered by `!testme` on a real PR, with all three stages
|
||||
(install / upgrade / backup+restore) actually exercised. The set must cover:
|
||||
a stateless/simple app, a single-DB app, a multi-service app, an SSO/identity app, and an
|
||||
object-storage/large-volume app. **Target set (all previously verified deployable):**
|
||||
`hedgedoc` (simple), `cryptpad` (stateful, no external DB), `keycloak` + `authentik`
|
||||
(SSO/identity, DB-backed), `lasuite-docs` and/or `lasuite-drive` (multi-service + S3/MinIO),
|
||||
`matrix-synapse` (DB + media store), `immich` (large volumes + Postgres), `bluesky-pds`
|
||||
(TLS-passthrough/atproto). Pick six that together satisfy the categories; record the chosen
|
||||
set and per-recipe green-run evidence in `REVIEW.md`. Any recipe that genuinely cannot be CI'd
|
||||
is a documented finding (in `DECISIONS.md`) with the reason, not a silent omission.
|
||||
|
||||
When all of D1–D10 hold and are Adversary-verified, write `## DONE` to `STATUS.md` with the
|
||||
evidence links and stop scheduling new iterations.
|
||||
|
||||
---
|
||||
|
||||
## 3. Repository layout (`git.autonomic.zone/recipe-maintainers/cc-ci`)
|
||||
|
||||
```
|
||||
cc-ci/
|
||||
├── README.md
|
||||
├── flake.nix # NixOS host(s) + devshell
|
||||
├── flake.lock
|
||||
├── hosts/
|
||||
│ └── cc-ci/
|
||||
│ ├── configuration.nix # the cc-ci machine
|
||||
│ └── hardware.nix
|
||||
├── modules/
|
||||
│ ├── drone.nix # Drone server + runner (exec/docker)
|
||||
│ ├── comment-bridge.nix # !testme webhook listener service
|
||||
│ ├── swarm.nix # Docker + single-node swarm + Traefik for test apps
|
||||
│ ├── dashboard.nix # results overview site
|
||||
│ └── secrets.nix # sops-nix / agenix wiring
|
||||
├── secrets/ # sops-encrypted (*.enc / *.age); see §4.4
|
||||
│ └── secrets.yaml
|
||||
├── bridge/ # comment-bridge source (small Go/Python service)
|
||||
├── runner/ # CI orchestration entrypoint invoked by Drone
|
||||
│ ├── run_recipe_ci.py # top-level: deploy→test→teardown for a recipe@ref
|
||||
│ └── harness/ # shared pytest fixtures (abra wrappers, app lifecycle)
|
||||
├── dashboard/ # results UI generator (reads Drone API → static site)
|
||||
├── tests/
|
||||
│ ├── conftest.py # shared fixtures, recipe selection, teardown guarantees
|
||||
│ ├── <recipe>/
|
||||
│ │ ├── test_install.py
|
||||
│ │ ├── test_upgrade.py
|
||||
│ │ ├── test_backup.py
|
||||
│ │ └── playwright/ # e2e flows for this recipe
|
||||
│ └── _template/ # copy-to-add-a-recipe template
|
||||
├── docs/
|
||||
│ ├── install.md # from-scratch server build (D8)
|
||||
│ ├── enroll-recipe.md # how to add a recipe (D5)
|
||||
│ ├── secrets.md # secret model + rotation (D6)
|
||||
│ ├── architecture.md
|
||||
│ ├── runbook.md # debugging failed runs
|
||||
│ └── baseline.md # bootstrap snapshot
|
||||
├── STATUS.md BACKLOG.md REVIEW.md JOURNAL.md DECISIONS.md # loop state (§7)
|
||||
└── .drone.yml # pipeline for cc-ci's own repo (lint/self-test)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Technical design (default architecture)
|
||||
|
||||
### 4.0 Domain model (where things live)
|
||||
|
||||
Two DNS zones, deliberately separated — do **not** conflate them:
|
||||
|
||||
- **`git.autonomic.zone` — source of truth for code (unchanged, not ours to reconfigure).**
|
||||
The Gitea host: the enrolled recipe repos and the `cc-ci` config repo live here. The loop reads,
|
||||
comments, and (when enrolling) adds a webhook here, but deploys **nothing** here. Per §9 this zone
|
||||
is read/comment-only — never push recipe code, never point app DNS at it.
|
||||
- **`commoninternet.net` — the CI server's own zone; everything CI-facing.** A wildcard
|
||||
`*.ci.commoninternet.net` resolves to a **gateway** (not cc-ci directly — see Network path below).
|
||||
Under it:
|
||||
- **Apps under test:** each run deploys to a unique subdomain
|
||||
`<recipe>-pr<n>-<short-sha>.ci.commoninternet.net`, so concurrent runs never collide on a
|
||||
hostname. The subdomain (app, volumes, secrets, Traefik route) is torn down at run end (§4.3).
|
||||
- **Results dashboard:** `ci.commoninternet.net` — overview page + per-recipe status badges (§4.5).
|
||||
- **Webhook bridge:** `ci.commoninternet.net/hook` — the Gitea `issue_comment` receiver (§4.1).
|
||||
- **Network path (gateway → TLS passthrough → cc-ci).** The wildcard record does **not** point at
|
||||
cc-ci's IP. It points at a gateway that **passes TLS through** to cc-ci: the gateway routes by SNI
|
||||
and forwards the raw encrypted stream without decrypting it, so TLS still **terminates on cc-ci's
|
||||
Traefik**. Consequences the agent must respect:
|
||||
- `dig <sub>.ci.commoninternet.net` returns the **gateway's** IP, not cc-ci's — do not assert the
|
||||
record points at cc-ci. Reachability is proven end-to-end (an HTTPS request lands on cc-ci),
|
||||
not by comparing A records.
|
||||
- The gateway is assumed to pass through the **whole wildcard**, so a fresh per-run subdomain needs
|
||||
**no gateway change** and **no cert work** (the pre-issued wildcard already covers it) — the
|
||||
agent only adds the Traefik **router** on cc-ci. (If the gateway
|
||||
instead needs per-host config, that's an operator/gateway concern and a `## Blocked` item, not
|
||||
something the agent reconfigures — the gateway is not ours, only cc-ci is, per §9.)
|
||||
- The gateway is operator-managed and out of scope; the agent configures only cc-ci.
|
||||
- **Caveat for TLS-passthrough recipes** (e.g. `bluesky-pds`, §2 D10): the default path terminates
|
||||
TLS at cc-ci's Traefik. A recipe that expects to terminate TLS in its own container needs cc-ci's
|
||||
Traefik configured to pass through that host too (the outer gateway already passes the whole
|
||||
wildcard). Treat this as a per-recipe harness quirk to absorb (§5 M6.5), or pick a non-passthrough
|
||||
recipe for that D10 category and record the swap in `DECISIONS.md` — not a silent omission.
|
||||
- **Wildcard TLS — operator pre-issues, agent serves it statically (no token in the agent).**
|
||||
Routing and certs are separate: the preconfigured wildcard DNS solves routing only; a cert is
|
||||
still needed because the gateway passes TLS through and cc-ci's Traefik terminates it. **The cert
|
||||
is pre-provisioned out-of-band** so the DNS-editing token never enters the agent/repo. A wildcard
|
||||
SAN cert covering **`*.ci.commoninternet.net` + `ci.commoninternet.net`** (issued via Let's
|
||||
Encrypt DNS-01 against Gandi, by the operator, using a token the agent never sees) lives on cc-ci:
|
||||
- `/var/lib/ci-certs/live/fullchain.pem` (leaf+intermediate) and `…/privkey.pem`.
|
||||
- The agent configures **Traefik's file provider** (`tls.certificates`, `certFile`/`keyFile`
|
||||
pointing at those paths) to serve it, and runs **no ACME resolver** for this domain. One cert
|
||||
covers every per-run subdomain, so spinning up an app domain needs no cert work at all.
|
||||
- **Renewal is a manual operator task** (LE 90-day cert): the operator re-issues out-of-band and
|
||||
drops the new files at the same paths (Traefik file provider hot-reloads). The agent must **not**
|
||||
attempt ACME/DNS-01 for `commoninternet.net` and must **not** expect a DNS token — a missing/
|
||||
expired cert is an operator action, surfaced as a finding, not something the agent re-issues.
|
||||
(Rationale for choosing a wildcard cert over per-subdomain: a wildcard is reused for every churning
|
||||
run subdomain and sidesteps LE's 50-certs/week-per-domain limit; only DNS-01 can mint a wildcard.
|
||||
We keep that DNS-01 issuance with the operator rather than handing the agent the zone token.)
|
||||
- Record the live facts in `docs/install.md`: the zone + DNS provider (Gandi), that the wildcard
|
||||
`*.ci.commoninternet.net` (and bare `ci.commoninternet.net`) point at the **gateway**, that the
|
||||
gateway TLS-passthroughs the wildcard to cc-ci, the gateway's address, the TTL, and that the
|
||||
wildcard cert is pre-issued/operator-renewed at `/var/lib/ci-certs/live/` (no DNS token on cc-ci).
|
||||
|
||||
### 4.1 The `!testme` trigger path
|
||||
|
||||
Gitea does not natively forward PR-comment events to Drone, and Drone's built-in triggers fire on
|
||||
push/PR-open, not on a magic comment. So:
|
||||
|
||||
```
|
||||
PR comment "!testme"
|
||||
│ Gitea webhook (issue_comment event) ──► comment-bridge (modules/comment-bridge.nix)
|
||||
│ • verifies webhook HMAC secret
|
||||
│ • checks comment body == "!testme" (exact, trimmed)
|
||||
│ • checks commenter is allowed (org member / collaborator)
|
||||
│ • resolves PR head repo + SHA via Gitea API
|
||||
│ • calls Drone API: build for cc-ci pipeline,
|
||||
│ params RECIPE=<repo> REF=<sha> PR=<n> SRC=<headrepo>
|
||||
▼
|
||||
Drone build (cc-ci repo pipeline, parameterized) ──► runner/run_recipe_ci.py
|
||||
▼
|
||||
Bridge posts/updates a Gitea PR comment with the run URL and (on completion) pass/fail.
|
||||
```
|
||||
|
||||
- The bridge is a tiny service (Go or Python+FastAPI). Keep it dependency-light; it's a NixOS
|
||||
systemd service behind Traefik at e.g. `ci.commoninternet.net/hook` (§4.0).
|
||||
- Enrollment = registering the Gitea webhook on a recipe repo (script in `runner/` or documented
|
||||
in `enroll-recipe.md`) + ensuring a `tests/<recipe>/` dir exists.
|
||||
- Decide and record in `DECISIONS.md`: one shared Gitea org-level webhook vs per-repo webhooks.
|
||||
Org-level is fewer moving parts; per-repo is more explicit. Default: per-repo via enroll script.
|
||||
|
||||
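The bridge's first two checks (webhook HMAC, exact trimmed `!testme` match) can be sketched as below. This is a minimal illustration, not the bridge itself: it assumes Gitea signs the raw request body with HMAC-SHA256 and sends the hex digest in a signature header (re-verify the header name and encoding against the installed Gitea version). The function names `signature_ok` and `is_trigger` are hypothetical.

```python
import hashlib
import hmac

TRIGGER = "!testme"


def signature_ok(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Return True if signature_hex is the hex HMAC-SHA256 of the raw body."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Compare as bytes in constant time; bytes also tolerate non-ASCII input.
    return hmac.compare_digest(
        expected.encode(), signature_hex.strip().lower().encode()
    )


def is_trigger(comment_body: str) -> bool:
    """Exact match after trimming whitespace; '!testmexyz' must not trigger."""
    return comment_body.strip() == TRIGGER
```

The signature must be computed over the raw bytes as received, before any JSON parsing, otherwise re-serialisation differences would make valid requests fail the check.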
### 4.2 Drone + the test target
|
||||
|
||||
- Drone server connects to Gitea via OAuth app (Gitea → Settings → Applications). Runner is the
|
||||
**exec runner** (or a privileged docker runner) running **on cc-ci itself**, because tests must
|
||||
drive `abra` to deploy real recipes onto a real swarm.
|
||||
- cc-ci doubles as the **deploy target**: single-node Docker Swarm + Traefik, abra installed,
|
||||
serving the `*.ci.commoninternet.net` wildcard, TLS terminated on cc-ci's Traefik using the
|
||||
**pre-issued static wildcard cert** at `/var/lib/ci-certs/live/` (§4.0). The operator preconfigures
|
||||
the wildcard DNS record (→ gateway), the gateway's TLS-passthrough to cc-ci, and the cert itself
|
||||
(§4.4); the agent configures Traefik (file provider → that cert) and swarm on top — **no ACME**.
|
||||
- Each CI run gets an isolated app domain `<recipe>-pr<n>-<short-sha>.ci.commoninternet.net`
|
||||
(§4.0) so concurrent runs don't collide. Teardown removes app, secrets, and volumes.
|
||||
- Consider a concurrency cap (1–2 deploys at a time) to avoid resource thrash; document it.
|
||||
|
||||
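The per-run domain scheme above could be generated by a small helper like the following sketch. The short-SHA length (7) and the names `run_domain`, `CI_ZONE` and `SHORT_SHA_LEN` are assumptions for illustration; the plan only fixes the shape `<recipe>-pr<n>-<short-sha>.ci.commoninternet.net`.

```python
import re

CI_ZONE = "ci.commoninternet.net"
SHORT_SHA_LEN = 7  # assumption: the plan does not fix the short-SHA length

# A single DNS label: 1-63 chars, lowercase alphanumerics and hyphens,
# not starting or ending with a hyphen.
_LABEL_RE = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")


def run_domain(recipe: str, pr: int, sha: str) -> str:
    """Build the unique app domain for one CI run."""
    label = f"{recipe}-pr{pr}-{sha[:SHORT_SHA_LEN].lower()}"
    if not _LABEL_RE.fullmatch(label):
        raise ValueError(f"not a valid DNS label: {label!r}")
    return f"{label}.{CI_ZONE}"
```

Because the PR number and head SHA are both part of the label, two concurrent runs only share a hostname if they test the same commit of the same PR, which is the case a concurrency cap or run lock would need to cover.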
### 4.3 The test harness & recipe test contract
|
||||
|
||||
`runner/run_recipe_ci.py` orchestrates per run:
|
||||
1. Fetch recipe at `$REF` (the PR head) via abra/git.
|
||||
2. **Install stage** → `tests/<recipe>/test_install.py`: `abra app new`, generate secrets,
|
||||
`abra app deploy`, wait healthy, run Playwright smoke + assertions.
|
||||
3. **Upgrade stage** → deploy previous published version first, then upgrade to `$REF`; assert
|
||||
data survives and app still healthy.
|
||||
4. **Backup/restore stage** → `abra app backup`, mutate state, `abra app restore`, assert restored
|
||||
state matches pre-mutation.
|
||||
5. **Recipe-local tests (D4)** → if `<recipe-repo>/tests/` exists, discover & run it in the same
|
||||
live environment; merge results.
|
||||
6. **Teardown (always, even on failure)** → `abra app undeploy`, `abra app volume remove`,
|
||||
`abra app secret remove`, DNS/route cleanup.
|
||||
|
||||
Shared fixtures (`tests/conftest.py` + `runner/harness/`) wrap abra. **Known abra gotchas to bake
|
||||
in from day one** (carried over from prior work, re-verify on the installed abra version):
|
||||
- `abra app undeploy` and `abra app volume remove` do **not** accept `--chaos` → never pass it.
|
||||
- Plumb a `timeout` kwarg through secret-generate/insert/remove-all calls.
|
||||
- `abra app ls -S -m` returns nested `{server: {apps: [...]}}` — parse the inner structure.
|
||||
- Pick robust health checks per app (e.g. Keycloak: `/realms/master`, not `/`).
|
||||
|
||||
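As an illustration of the nested-output gotcha, the sketch below flattens a `{server: {apps: [...]}}` payload and picks out per-run apps by name. Only the outer nesting comes from the note above; the per-app field name (`name_key`, defaulting to `"appName"`) and the helper names are hypothetical and should be checked against the JSON the installed abra version actually emits.

```python
import re
from typing import Any, Iterator

# Matches the per-run naming scheme <recipe>-pr<n>-<short-sha>...
_PR_APP_RE = re.compile(r"-pr\d+-")


def iter_apps(payload: dict[str, Any]) -> Iterator[tuple[str, dict[str, Any]]]:
    """Yield (server, app) pairs from a {server: {"apps": [...]}} payload."""
    for server, entry in payload.items():
        for app in (entry or {}).get("apps") or []:
            yield server, app


def pr_app_names(payload: dict[str, Any], name_key: str = "appName") -> list[str]:
    """Names of apps that look like CI-run deployments (janitor candidates)."""
    return [
        app[name_key]
        for _, app in iter_apps(payload)
        if _PR_APP_RE.search(str(app.get(name_key, "")))
    ]
```

A janitor pass would still need an age for each candidate before removing anything; where that age comes from is left open here.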
The teardown guarantee is sacred: a failed test must never leak a deployed app or volume into the
|
||||
next run. Implement teardown as a pytest fixture finalizer / `try/finally` in the orchestrator and
|
||||
add a janitor pass at run start that nukes any orphaned `*-pr*` apps older than N hours.
|
||||
|
||||
### 4.4 Secrets (D6)
|
||||
|
||||
There are **two distinct classes of secret** and they are handled in opposite ways. Do not
|
||||
conflate them.
|
||||
|
||||
**(A) Infra secrets.** All of these end up `sops-nix`-encrypted in `secrets/`, are decrypted at
activation to a runtime path (`/run/secrets` by default, not the world-readable Nix store), and are
never world-readable. But they split into two sub-classes (see §1.5 for the concrete
locations/usage), and only the first sub-class blocks:
|
||||
|
||||
- **(A1) External inputs — provided by the operator, the loop cannot create them.** The Tailscale
|
||||
auth key + Gitea bot creds (`/srv/cc-ci/.testenv`, already provided), the **pre-issued wildcard
|
||||
TLS cert** at `/var/lib/ci-certs/live/` (§4.0 — *not* a DNS token; the agent serves it, never
|
||||
issues it), and **registry pull creds** (if needed). If one of these is **missing or invalid, the
|
||||
loop is blocked** — write it to `STATUS.md ## Blocked` and stop (§9). The agent must not invent or
|
||||
work around an external input it wasn't given, and must **not** attempt ACME/DNS-01 for
|
||||
`commoninternet.net`.
|
||||
- **(A2) Internal secrets — the loop generates and manages these itself; never block on them.**
|
||||
Drone RPC secret + webhook HMAC (`openssl rand`), the Gitea OAuth app for Drone (created via the
|
||||
bot API), and the cc-ci host age/GPG key for sops. These are *not* human inputs; generate, store
|
||||
in `secrets/`, and wire both ends.
|
||||
|
||||
Alongside these, three **preconfigured network/cert facts** are operator-provided inputs the loop
|
||||
also depends on (not secrets the agent makes, but class-A in the same "provided, don't improvise"
|
||||
sense): (1) the wildcard `*.ci.commoninternet.net` record (and bare `ci.commoninternet.net`) already
|
||||
points at the **gateway**, (2) the gateway **TLS-passthroughs** that wildcard to cc-ci (SNI-routed,
|
||||
no decryption — see §4.0 Network path), and (3) the **pre-issued wildcard cert** is in place at
|
||||
`/var/lib/ci-certs/live/`. The operator owns the DNS record, the gateway, and cert issuance/renewal;
|
||||
**everything else on cc-ci is the agent's job** — Traefik (pointed at the static cert), swarm,
|
||||
per-run subdomain routing, and teardown. If the wildcard does not resolve, the gateway doesn't reach
|
||||
cc-ci, or the cert is missing/expired, that is a `## Blocked` condition (operator action), not
|
||||
something to work around (the gateway and DNS are not ours to reconfigure, per §9).
|
||||
|
||||
**(B) Recipe app secrets — generated by the test, persisted within the run.** These are NOT a
|
||||
blocker and are NOT pre-provisioned by a human. The harness creates them itself for each app under
|
||||
test and is responsible for persisting them across the run so the multi-stage lifecycle works:
|
||||
|
||||
- **Generate at install:** the harness runs `abra app secret generate` (+ inserts any deterministic
|
||||
test fixtures like an admin password / test user it chooses) when it deploys the app.
|
||||
- **Persist for the run's duration:** the *same* generated secrets must survive across stages —
|
||||
install → upgrade and especially **backup → restore** — because an app cannot be upgraded or
|
||||
restored against rotated credentials. Persist them in a per-run secret store keyed by the run's
|
||||
unique app name (e.g. `<recipe>-pr<n>-<sha>`): the live abra/swarm secrets plus a sidecar record
|
||||
the harness writes (e.g. the app's `.env` + the generated values) to a run-scoped, non-public
|
||||
location on the runner, so any stage can re-read them. They are ephemeral by design.
|
||||
- **Destroy at teardown:** the same teardown that removes the app/volumes also runs
|
||||
`abra app secret remove` (with `timeout` plumbed) and deletes the per-run sidecar. Nothing
|
||||
generated for a run outlives that run.
|
||||
- **How the harness should "figure out" persistence (acceptance for D6):** decide and document one
|
||||
concrete mechanism — recommended default is "abra/swarm holds the live secrets; the harness keeps
|
||||
a run-scoped sidecar file under a `runs/<app-name>/` dir on the runner (mode 600), and reloads
|
||||
from it between stages." Whatever is chosen, it must (1) keep the same values stable across all
|
||||
three stages, (2) isolate concurrent runs from each other, and (3) leave nothing behind.
|
||||
|
||||
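The recommended sidecar mechanism could look roughly like this sketch: one JSON file per run under `runs/<app-name>/`, created with mode 600, re-read between stages, and deleted at teardown. The file name `secrets.json`, the JSON format and the function names are assumptions, not part of the plan.

```python
import json
import os
import shutil
from pathlib import Path


def sidecar_path(runs_root: Path, app_name: str) -> Path:
    """Per-run location, keyed by the unique app name so runs stay isolated."""
    return runs_root / app_name / "secrets.json"


def save_secrets(runs_root: Path, app_name: str, values: dict[str, str]) -> Path:
    """Write the generated values to a file only the owner can read."""
    path = sidecar_path(runs_root, app_name)
    path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    # Create with 0600 from the start rather than chmod-ing afterwards.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        json.dump(values, fh)
    os.chmod(path, 0o600)  # tighten if the file already existed with a looser mode
    return path


def load_secrets(runs_root: Path, app_name: str) -> dict[str, str]:
    """Re-read the same values in a later stage (upgrade, backup, restore)."""
    return json.loads(sidecar_path(runs_root, app_name).read_text())


def destroy_secrets(runs_root: Path, app_name: str) -> None:
    """Remove the whole per-run directory at teardown; missing is fine."""
    shutil.rmtree(runs_root / app_name, ignore_errors=True)
```

Keying the directory by the run's app name covers requirement (2), and calling `destroy_secrets` from the same teardown path that removes the app covers requirement (3).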
**(C) Drone CI tokens:** store as Drone org/repo secrets, referenced by the pipeline. Where a value
|
||||
is an external input (A1, e.g. registry creds) it is provided; where it is internal (A2) it is
|
||||
generated — see the (A) split above.
|
||||
|
||||
Hard rule across all classes: scrub secrets from logs before they reach the dashboard; the results
|
||||
UI shows sanitized logs only. Add a redaction filter in the log pipeline and an Adversary test that
|
||||
greps published logs and the overview site for known secret patterns and any generated app
|
||||
password.
|
||||
|
||||
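A value-based redaction filter and the matching leak check might be sketched as follows. It assumes the harness can hand the log pipeline the list of secret values it generated for the run; pattern-based detection of unknown secrets is out of scope for this sketch, and the names `redact` and `leaked` are hypothetical.

```python
from typing import Iterable

MASK = "[REDACTED]"


def redact(text: str, secrets: Iterable[str], mask: str = MASK) -> str:
    """Replace every known secret value in text with a mask."""
    # Longest first, so a secret that contains another is masked as a whole.
    for value in sorted({s for s in secrets if s}, key=len, reverse=True):
        text = text.replace(value, mask)
    return text


def leaked(text: str, secrets: Iterable[str]) -> list[str]:
    """Secrets that still appear verbatim in published text (Adversary check)."""
    return [s for s in secrets if s and s in text]
```

The Adversary's test is then `leaked(published_log, run_secrets) == []` for each published log and for the overview page.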
### 4.5 Results UX (D7) — YunoHost-CI-like
|
||||
|
||||
- **Per-run logs:** Drone's native UI already gives live, per-stage, tail-able logs and a final
|
||||
status — use it as the canonical run view; the PR comment links to it.
|
||||
- **Overview page:** a small generator (`dashboard/`) polls the Drone API and renders a static
|
||||
page at `ci.commoninternet.net` (§4.0): a table of enrolled recipes, latest run status badge
|
||||
(pass/fail/running), last-tested version, link to history — mirroring the YunoHost app-list
|
||||
feel. Served by Traefik; regenerated on build-completion webhook or a short timer.
|
||||
- Provide a status badge endpoint per recipe for embedding in recipe READMEs.
|
||||
|
||||
---
|
||||
|
||||
## 5. Milestones / initial BACKLOG
|
||||
|
||||
Work top-down; each milestone ends with an **Adversary gate** (Adversary must independently
|
||||
verify the acceptance check before the next milestone starts). Seed `BACKLOG.md` from this.
|
||||
|
||||
- **M0 — Foundations.** Repo created; flake builds; `nixos-rebuild` (or deploy-rs) applies a
|
||||
no-op-then-base config to cc-ci; sops decrypts a test secret on the host.
|
||||
*Accept:* `ssh cc-ci 'systemctl is-system-running'` healthy after a rebuild from the repo.
|
||||
- **M1 — Swarm + abra target.** Docker + single-node swarm + Traefik up; wildcard DNS + TLS;
|
||||
abra can deploy and tear down a trivial recipe by hand.
|
||||
*Accept:* a recipe deployed via abra is reachable over HTTPS at `*.ci.commoninternet.net`, then
|
||||
fully torn down leaving no volumes.
|
||||
- **M2 — Drone online.** Drone server+runner via Nix, OAuth to Gitea; a hello-world `.drone.yml`
|
||||
in cc-ci runs green; logs visible in Drone UI.
|
||||
*Accept:* push to cc-ci triggers a visible green Drone build.
|
||||
- **M3 — Comment bridge.** `!testme` on a PR triggers a parameterized Drone build; bridge posts a
|
||||
PR comment with the run link; non-`!testme` comments and non-collaborators are ignored.
|
||||
*Accept:* live demo on a scratch PR — comment in, build out, link back, auth enforced.
|
||||
- **M4 — Harness + install stage.** `run_recipe_ci.py` + conftest; install stage green for one
|
||||
simple recipe end-to-end with a Playwright assertion; guaranteed teardown.
|
||||
*Accept:* full green install run for recipe #1, no orphaned app/volume afterward.
|
||||
- **M5 — Upgrade + backup/restore stages.** Add the other two stages for recipe #1.
|
||||
*Accept:* upgrade preserves data; backup→mutate→restore returns original state.
|
||||
- **M6 — Recipe-local tests (D4) + second recipe.** Discover/run recipe-repo `tests/`; enroll a
|
||||
second, DB-backed recipe via the documented flow.
|
||||
*Accept:* both recipes green; recipe-local tests demonstrably executed and merged.
|
||||
- **M6.5 — Breadth ramp.** Enroll recipes 3→6 covering the remaining D10 categories, one at a
|
||||
time, each via the documented enroll flow (this is the real test of D5: enrolling recipe N
|
||||
should be template-copy + recipe-specific tests/fixtures, with **no harness surgery**). Expect
|
||||
per-recipe quirks — multi-service deps, S3/MinIO config, SSO client setup, TLS passthrough,
|
||||
large-volume backups — and absorb them into the *shared* harness, not one-off per-recipe hacks.
|
||||
When flakiness appears, add real readiness/wait robustness to the harness rather than sprinkling
|
||||
`sleep`s. Run benchmarks/long deploys **sequentially**, never in parallel (network contention).
|
||||
*Accept:* recipes 3–6 each have a full three-stage green run; enrolling N≥3 needed no changes to
|
||||
shared harness code.
|
||||
- **M7 — Secrets hardening (D6).** Full sops model, rotation doc, log redaction + leak test.
|
||||
*Accept:* Adversary's secret-grep over published logs finds nothing; rotation doc followed.
|
||||
- **M8 — Dashboard (D7).** Overview page + badges + PR-comment outcome reflection.
|
||||
*Accept:* overview matches reality across several runs; outcomes mirrored to PR comments.
|
||||
- **M9 — Reproducibility + docs (D8/D9).** `docs/install.md` rebuilds the server from scratch on a
|
||||
blank VM; all docs complete.
|
||||
*Accept:* Adversary rebuilds from docs onto a throwaway host (or records the tested subset).
|
||||
- **M10 — Proof (D10).** All six chosen recipes green via real `!testme` PRs (the breadth set from
|
||||
M6/M6.5 carried through the hardened pipeline), each with install/upgrade/backup-restore
|
||||
exercised and Adversary-verified; flip `STATUS.md` to DONE.
|
||||
|
||||
---
|
||||
|
||||
## 6. The two agents
|
||||
|
||||
### Builder (primary)
|
||||
Implements the backlog top-down. Discipline:
|
||||
- One backlog item in flight at a time. Small, committed, reversible steps.
|
||||
- Every change verified against the *real* system (server, Drone, Gitea) before claiming done —
|
||||
never "should work". Paste the verifying command + output into `JOURNAL.md`.
|
||||
- Touch production carefully: cc-ci is the only target; never deploy test apps onto unrelated
|
||||
production servers; never reuse production domains. Idempotent server changes only (via Nix).
|
||||
- If blocked on access/secrets/external state, write it to `STATUS.md ## Blocked` and pick up an
|
||||
unblocked item rather than hacking around it.
|
||||
|
||||
### Adversary (reviewer)
|
||||
Runs as a **separate, independent loop in its own process/sandbox** (see §6.1 for how the two
|
||||
loops coordinate). Its job is to **disbelieve**. It:
|
||||
- Re-verifies each `Definition of Done` and milestone-acceptance claim independently, from a cold
|
||||
start (fresh shell, own clone, no cached state), and logs PASS/FAIL + evidence in `REVIEW.md`.
|
||||
- Actively tries to break things: comment `!testmexyz` (should NOT trigger), comment as a
|
||||
non-collaborator (should be rejected), push a PR that fails tests (must report red, not green),
|
||||
kill an app mid-run (teardown must still clean up), grep published logs/dashboard for secrets,
|
||||
run two `!testme`s concurrently (no domain/volume/secret collision), confirm the same generated
|
||||
app secrets persist across install→upgrade→backup/restore.
|
||||
- Files every defect as a `BACKLOG.md` item tagged `[adversary]` with repro steps. The Builder
|
||||
may not close an adversary item; only the Adversary closes it after re-test.
|
||||
- Has veto power over `STATUS.md → DONE`.
|
||||
|
||||
### 6.1 Coordination protocol (two independent loops, one shared repo)
|
||||
|
||||
The two loops never talk directly; the **git repo is the only coordination medium**. Each agent
|
||||
has its own clone (e.g. Builder in `/srv/cc-ci/cc-ci`, Adversary in `/srv/cc-ci/cc-ci-adv`) and
|
||||
its own pacing. To make concurrent writes conflict-free:
|
||||
|
||||
- **File ownership (one writer each — the other only reads):**
|
||||
- Builder owns: all source code/config, `STATUS.md`, `JOURNAL.md`, `DECISIONS.md`.
|
||||
- Adversary owns: `REVIEW.md`.
|
||||
- `BACKLOG.md` is split into two H2 sections — `## Build backlog` (Builder-only) and
|
||||
`## Adversary findings` (Adversary-only). Each agent edits **only its own section**, so git
|
||||
merges the two cleanly. Closing an item = checking the box *in your own section*; the Builder
|
||||
fixes an `[adversary]` finding and notes the fix in `JOURNAL.md`, but only the Adversary ticks it
|
||||
closed after re-test.
|
||||
- **Append-only where possible.** `JOURNAL.md` and `REVIEW.md` are append-only logs → they never
|
||||
conflict. Prefer appending over rewriting.
|
||||
- **Git discipline (both loops, every write):** `git pull --rebase` before editing, make the
|
||||
smallest change, commit, `git push`. A rebase conflict means an ownership rule was broken (someone
edited the *other* agent's file/section): re-pull and keep to your own files. Never `--force`.
|
||||
- **Gate handshake via STATUS.md.** When the Builder believes a milestone gate is met, it sets in
|
||||
`STATUS.md`: `Gate: <Mn> — CLAIMED, awaiting Adversary` and stops advancing past it. The
|
||||
Adversary, on its next wake, sees the claim, runs the acceptance check cold, and writes the
|
||||
verdict to `REVIEW.md` (`<Mn>: PASS @<ts>` with evidence, or `FAIL` + an `[adversary]` item).
|
||||
The Builder only proceeds past the gate after seeing `PASS` in `REVIEW.md`.
|
||||
- **DONE handshake.** Builder may write `## DONE` to `STATUS.md` **only** when `REVIEW.md` shows a
|
||||
PASS dated within 24h for every D1–D10. The Adversary can write `## VETO <reason>` to
|
||||
`REVIEW.md` at any time, which forbids DONE until cleared.
|
||||
- **Liveness.** If the Adversary sees a gate `CLAIMED` for too long with no Builder progress, or
|
||||
the Builder sees no Adversary verdict on a standing claim, note it in your own ledger and keep
|
||||
doing independent work — neither loop blocks idle waiting on the other beyond its gate.
|
||||
|
||||
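The DONE handshake reduces to a mechanical check over `REVIEW.md`, sketched below. It assumes verdict lines of the form `D<n>: PASS @<timestamp>` (mirroring the `<Mn>: PASS @<ts>` gate format), timezone-aware ISO 8601 timestamps such as `2026-01-02T00:00:00+00:00`, and, because the plan does not say how a veto is marked as cleared, it conservatively treats any `## VETO` heading as standing. `done_allowed` is a hypothetical name.

```python
import re
from datetime import datetime, timedelta

REQUIRED = [f"D{i}" for i in range(1, 11)]  # D1..D10
_VERDICT_RE = re.compile(r"^(D\d+): (PASS|FAIL) @(\S+)", re.MULTILINE)
_VETO_RE = re.compile(r"^## VETO\b", re.MULTILINE)


def done_allowed(review_text: str, now: datetime,
                 max_age: timedelta = timedelta(hours=24)) -> bool:
    """True only if every D-item's latest verdict is a fresh PASS and no veto stands."""
    if _VETO_RE.search(review_text):
        return False
    latest: dict[str, tuple[str, datetime]] = {}
    # REVIEW.md is append-only, so the last verdict in file order is the latest.
    for item, verdict, ts in _VERDICT_RE.findall(review_text):
        try:
            latest[item] = (verdict, datetime.fromisoformat(ts))
        except ValueError:
            latest.pop(item, None)  # unparseable timestamp counts as no verdict
    for item in REQUIRED:
        if item not in latest:
            return False
        verdict, when = latest[item]
        if verdict != "PASS" or now - when > max_age:
            return False
    return True
```

Using the last verdict per item means a later `FAIL` overrides an earlier `PASS`, which matches the Adversary's role of re-verifying rather than accumulating credit.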
(If you are ever forced to run with a single process, the degraded fallback is to alternate
|
||||
roles per iteration and keep `JOURNAL.md` and `REVIEW.md` strictly separate — but two loops is
|
||||
the intended design.)

---

## 7. The Loop Protocol

Both loops run this same shape; state lives in the repo so it survives restarts/compaction. On
every wake, `git pull --rebase` first, then:

1. **Orient.** Read `STATUS.md` (phase, in-flight item, gate claims, blockers), `BACKLOG.md`, and
   the tail of `REVIEW.md`. Reconcile with reality via cheap probes (Drone health, last build,
   `git log`) — never trust the ledger blindly; if it disagrees with the system, fix the ledger
   first (your own files only — see §6.1).
2. **Select.**
   - *Builder:* highest-priority open item in `## Build backlog`: unresolved `[adversary]`
     findings > current milestone's next task > next milestone. Never advance past a milestone gate
     until `REVIEW.md` shows its PASS.
   - *Adversary:* any standing `Gate: <Mn> CLAIMED` in `STATUS.md` to verify > re-verify a D1–D10
     gate whose last PASS is stale (>24h) > a fresh break-it probe from §6.
3. **Act.** Smallest change that advances the item. Builder verifies against the real system;
   Adversary verifies from a cold start. Commit with a clear message (author per repo convention).
4. **Record (your own files only).** *Builder:* append to `JOURNAL.md` (what you did + verifying
   command/output + next), update `STATUS.md`, tick `## Build backlog`. *Adversary:* append PASS/
   FAIL + evidence to `REVIEW.md`, add/close items in `## Adversary findings`. Then `git push`.
5. **Gate handshake (§6.1).** Builder, on reaching a milestone, sets `Gate: <Mn> CLAIMED, awaiting
   Adversary` in `STATUS.md` and works on other unblocked items meanwhile. Adversary clears it with
   a `REVIEW.md` verdict. No gate is "passed" without a logged PASS.
6. **Decide continuation.** Builder writes `## DONE` only when `REVIEW.md` shows a <24h PASS for
   every D1–D10 and no standing `## VETO`. Otherwise schedule the next wake.
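
The Builder's ordering in step 2 can be sketched as a small priority function. This is a minimal sketch, not the plan's implementation: the `Item` type, the `kind` labels, and the `gate_passed` flag are hypothetical names introduced for illustration, and a real loop would derive them from `BACKLOG.md`, `STATUS.md`, and `REVIEW.md`.

```python
from dataclasses import dataclass
from typing import List, Optional

# Lower rank is picked first, mirroring the Builder order in step 2.
# The kind labels are hypothetical; the plan does not define them.
RANK = {"adversary": 0, "current": 1, "next": 2}


@dataclass
class Item:
    title: str
    kind: str          # "adversary", "current", or "next"
    done: bool = False


def select_builder_item(backlog: List[Item], gate_passed: bool) -> Optional[Item]:
    """Pick the next Builder item, never crossing an unverified gate."""
    candidates = [
        item for item in backlog
        # Next-milestone work stays off limits until REVIEW.md shows the PASS.
        if not item.done and (gate_passed or item.kind != "next")
    ]
    if not candidates:
        return None
    # min() returns the first item among equals, so backlog order breaks ties.
    return min(candidates, key=lambda item: RANK[item.kind])
```

Returning `None` when only next-milestone work remains behind an unverified gate corresponds to the Builder parking at the gate and scheduling a longer sleep.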

**Pacing.** Use `/loop` (self-paced) or `ScheduleWakeup`. Most waits here are for things the
harness can't notify you about — a Drone build, a `nixos-rebuild`, a deploy converging — so poll
the *specific* thing: while a build/deploy is in flight, re-check on a short cadence (≈4 min) to
stay cache-warm; when genuinely idle between iterations, sleep longer (20–30 min). Don't burn
iterations spinning on a build that takes minutes.
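
The pacing rule reduces to choosing a delay from the loop's current state. The helper below is a sketch under assumptions: the function name is hypothetical, and picking a random point inside the 20 to 30 minute idle window is an illustrative choice, since the plan only gives the range.

```python
import random

IN_FLIGHT_POLL_S = 4 * 60                 # about 4 min while a build/deploy is running
IDLE_SLEEP_RANGE_S = (20 * 60, 30 * 60)   # 20 to 30 min when genuinely idle


def next_wake_delay(in_flight, rng=None):
    """Return the number of seconds to wait before the next wake."""
    if in_flight:
        return IN_FLIGHT_POLL_S
    rng = rng or random.Random()
    low, high = IDLE_SLEEP_RANGE_S
    # Assumption: any value in the idle window is acceptable.
    return rng.randint(low, high)
```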

**Anti-drift guards.**
- Cap retries: if an approach fails 3× the same way, stop, write the dead-end in `DECISIONS.md`,
  and try a different approach or mark blocked. No thrashing.
- Never weaken a test to make it pass. A red test is information; fix the recipe/harness or file
  a finding — do not delete the assertion. (This is the single most important rule; the Adversary
  watches specifically for tests being softened or skipped.)
- Keep changes reversible; prefer Nix-declared state over imperative server edits so any rebuild
  reproduces it.
- Don't expand scope beyond §2. New ideas → `BACKLOG.md` (tagged `[idea]`), not into this run.
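
The retry cap in the first guard can be sketched as a counter keyed on the approach and its failure signature. The class and method names are hypothetical, and how a "same way" failure is fingerprinted (here, a plain string) is an assumption left to the caller.

```python
from collections import Counter

MAX_IDENTICAL_FAILURES = 3  # "fails 3x the same way" from the guard above


class RetryGuard:
    """Counts identical failures so a loop knows when to stop thrashing."""

    def __init__(self):
        self._failures = Counter()

    def record_failure(self, approach, error_signature):
        """Record one failure; return True once the cap is reached."""
        key = (approach, error_signature)
        self._failures[key] += 1
        return self._failures[key] >= MAX_IDENTICAL_FAILURES
```

When `record_failure` returns `True`, the loop would write the dead-end to `DECISIONS.md` and either switch approach or mark the item blocked.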

---

## 8. Open decisions to settle early (log in DECISIONS.md)

- Deploy mechanism: `nixos-rebuild --target-host` vs `deploy-rs`/`colmena`. (Default: deploy-rs
  for atomic rollbacks; nixos-rebuild fine if simpler.)
- Webhook scope: per-repo vs org-level Gitea webhook. (Default: per-repo via enroll script.)
- Drone runner type: exec vs privileged docker. (Default: exec, since it must drive host abra.)
- Secret tool: sops-nix vs agenix. (Default: sops-nix for multi-recipient + yaml ergonomics.)
- Wildcard TLS: **SETTLED — operator pre-issues a wildcard cert; the agent serves it statically, no
  token** (§4.0). The operator issued a wildcard SAN cert (`*.ci.commoninternet.net` +
  `ci.commoninternet.net`) via LE DNS-01/Gandi out-of-band and placed it at
  `/var/lib/ci-certs/live/`; the agent configures Traefik's file provider to serve it and runs no
  ACME for this domain. Chosen so the DNS-editing token never enters the repo/agent. **Manual
  renewal** every ~90 days (next ~2026-08-24) — operator re-issues and replaces the files in place.
- Proof recipe set (D10 — six, category-spanning). Default candidates, all previously verified
  deployable: `hedgedoc`, `cryptpad`, `keycloak`, `authentik`, `lasuite-docs`/`lasuite-drive`,
  `matrix-synapse`, `immich`, `bluesky-pds`. Lock the final six early so M4–M6.5 build toward them.
  Sequence easy→hard: prove the pipeline on `hedgedoc`/`cryptpad` before tackling SSO, S3, media
  stores, and TLS-passthrough recipes.

Each default stands until the Adversary or reality forces a change; record the change and why.

---

## 9. Guardrails / hard rules

- **Access boundary:** only cc-ci is yours to reconfigure. Recipe repos: read + comment + (when
  enrolling) add a webhook — nothing else. Never push to a recipe repo's code.
- **No secrets in git/logs/UI.** Ever. Verified by the Adversary's leak test.
- **No mocks for the e2e stages.** D2 means real deploys. If something can't be tested for real,
  it's a finding, not a pass.
- **Idempotent + reversible.** Anything done to the server must be re-derivable from the repo.
- **Stop on missing *external* infra inputs** (class-A1 in §4.4: cc-ci SSH/root access, the
  Tailscale auth key, Gitea bot creds, the pre-issued wildcard cert at `/var/lib/ci-certs/live/`,
  registry creds — and the preconfigured DNS/gateway facts) rather than improvising around them;
  surface in `STATUS.md ## Blocked`. **Never** attempt ACME/DNS-01 for `commoninternet.net` — the
  cert is pre-provided and renewed out-of-band by the operator. **This does NOT apply to**
  internal infra secrets (class-A2: Drone RPC, webhook HMAC, Gitea OAuth app, host age key — the
  agent generates these) or to recipe app secrets (class-B): those the test harness generates
  itself (`abra app secret generate` + chosen fixtures), persists for the run, and destroys at
  teardown — a missing app secret is never a blocker, it is something the harness creates. See
  §4.4.
- **Honest reporting.** If a stage is skipped or a check failed, say so in `STATUS.md`/`JOURNAL.md`
  with the output. The loop's value depends entirely on the ledgers being true.
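
The three secret classes named in the "stop on missing external infra inputs" rule each call for a different reaction when a value is absent. The mapping below is a sketch only: the enum, the function name, and the returned action strings are hypothetical labels, while the class-to-behaviour pairing follows the rule above.

```python
from enum import Enum


class SecretClass(Enum):
    A1_EXTERNAL = "A1"    # operator-provided infra inputs
    A2_INTERNAL = "A2"    # internal infra secrets
    B_RECIPE_APP = "B"    # recipe app secrets


def on_missing(secret_class):
    """Return the action to take when a secret of this class is missing."""
    if secret_class is SecretClass.A1_EXTERNAL:
        return "block"             # surface in STATUS.md ## Blocked and stop
    if secret_class is SecretClass.A2_INTERNAL:
        return "generate"          # the agent creates it itself
    return "harness-generate"      # created per run, destroyed at teardown
```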
19
cc-ci-plan/prompts/adversary.md
Normal file
@ -0,0 +1,19 @@
You are the Adversary agent for cc-ci — one of two independent loops. Your job is to DISBELIEVE the Builder. Read /srv/cc-ci/cc-ci-plan/plan.md in full, especially §2, §6, §6.1, and §9.

Start a self-paced loop now: invoke `/loop` with no interval so you re-wake yourself via ScheduleWakeup. Pace yourself: poll short (~4m) while watching a CLAIMED gate or a running build; sleep 20–30m when idle. Keep running independent break-it probes even when no gate is pending. Stop only when STATUS.md says ## DONE and you have logged a fresh PASS for every D1–D10.

Credentials/access: §1.5 is the authoritative map. Provided creds are in /srv/cc-ci/.testenv and ~/.ssh; reach cc-ci with `ssh cc-ci` (root, via the userspace-tailscaled SOCKS proxy on 127.0.0.1:1055), and hit the dashboard / *.ci.commoninternet.net through that proxy (`curl --proxy socks5h://localhost:1055 ...`). If the proxy is down, restart it per §1.5. Verify from a COLD START, but you may rely on this shared access path.

You run as a SEPARATE process and coordinate ONLY through the git repo per §6.1:
- Keep your OWN clone at /srv/cc-ci/cc-ci-adv. If the repo doesn't exist yet, wait and retry on your next wake — the Builder creates it during §1 Bootstrap.
- git pull --rebase before every edit; commit; push; never --force.
- Write ONLY your files: REVIEW.md and the "## Adversary findings" section of BACKLOG.md. Everything else (code, STATUS.md, JOURNAL.md, "## Build backlog") is read-only to you.

Each wake:
1. Pull. Read STATUS.md for any "Gate: <Mn> CLAIMED, awaiting Adversary".
2. Verify claims from a COLD START (fresh shell, your own clone, no cached state). Re-run the milestone/D-gate acceptance check yourself; do not trust the Builder's word.
3. Actively try to break things: !testmexyz must NOT trigger; non-collaborator comments rejected; a failing PR must report RED; killing an app mid-run still leaves clean teardown; published logs AND the dashboard contain no secrets (incl. generated app passwords); two concurrent !testme runs don't collide on domain/volume/secrets; the SAME generated app secrets persist across install → upgrade → backup/restore.
4. Record verdicts in REVIEW.md ("<Mn>: PASS @<ts>" + evidence, or FAIL). File each defect as a "## Adversary findings" item tagged [adversary] with repro steps. Only YOU close those, after re-test. You hold veto power: write "## VETO <reason>" to REVIEW.md to forbid DONE until cleared.
5. Push. Schedule the next wake.

Begin: read /srv/cc-ci/cc-ci-plan/plan.md, then enter the self-paced loop (start by cloning the repo to /srv/cc-ci/cc-ci-adv once it exists).
25
cc-ci-plan/prompts/builder.md
Normal file
@ -0,0 +1,25 @@
You are the Builder agent for the cc-ci project — one of two independent loops. Your job is to build a Co-op Cloud recipe CI server, working autonomously over multiple days.

Single source of truth: /srv/cc-ci/cc-ci-plan/plan.md. Read it in full now, then begin at §1 Bootstrap. The original brief /srv/cc-ci/cc-ci-plan/brief.md is context only — do not edit it.

Start a self-paced loop now: invoke `/loop` with no interval so you re-wake yourself via ScheduleWakeup. Each iteration = one unit of work (see §7). Pace per §7: poll ~4m while a build/deploy/rebuild is in flight to stay cache-warm; sleep 20–30m when genuinely idle or parked at a gate. Do NOT spin on a build that takes minutes. Stop the loop only when STATUS.md says ## DONE.

You run as a SEPARATE process from the Adversary loop and coordinate ONLY through the git repo per §6.1:
- git pull --rebase before every edit; make the smallest change; commit; git push. Never --force.
- Write ONLY your files: source/config, STATUS.md, JOURNAL.md, DECISIONS.md, and the "## Build backlog" section of BACKLOG.md. Treat REVIEW.md and "## Adversary findings" as read-only — the Adversary owns them.
- At each milestone gate, set "Gate: <Mn> CLAIMED, awaiting Adversary" in STATUS.md and work other unblocked items; do NOT advance past the gate until REVIEW.md shows its PASS.
- Write "## DONE" only when REVIEW.md shows a PASS dated <24h for every D1–D10 and there is no standing "## VETO".

Overriding rules:
- "Done" is defined ONLY by §2 (D1–D10), Adversary-verified. No self-certifying.
- Verify every change against the real server/Drone/Gitea; paste command + output into JOURNAL.md. No "should work."
- Never weaken, skip, or delete a test to make a run pass. A red test is information.
- Only cc-ci is yours to reconfigure. Never push code to recipe repos; never touch production servers/domains. Keep server state Nix-declared and reversible.
- 3rd identical failure → stop, record dead-end in DECISIONS.md, change approach or mark blocked.
- Credentials: §1.5 is the authoritative map. Provided creds are in /srv/cc-ci/.testenv (TS_AUTH_KEY, GITEA_USERNAME/PASSWORD/URL) and ~/.ssh (cc-ci-root-ed25519). Reach cc-ci with `ssh cc-ci` (root, via the userspace-tailscaled SOCKS proxy on 127.0.0.1:1055); if it fails, restart the proxy per §1.5 before declaring blocked. There is NO ready-made $GITEA_TOKEN — mint one from the bot creds if you want a token.
- Secret classes (§4.4), handled differently:
  • Class A1 EXTERNAL infra inputs (cc-ci SSH/root access, TS auth key, Gitea bot creds, the pre-issued wildcard TLS cert at /var/lib/ci-certs/live/, registry creds; plus the preconfigured DNS/gateway facts): if missing/invalid → STATUS.md ## Blocked and stop. Do NOT improvise/invent. NEVER attempt ACME/DNS-01 for commoninternet.net — the cert is pre-provided and renewed out-of-band; point Traefik's file provider at /var/lib/ci-certs/live/{fullchain.pem,privkey.pem}.
  • Class A2 INTERNAL infra secrets (Drone RPC, webhook HMAC, Gitea OAuth app, host age key): you GENERATE these yourself — never block on them.
  • Class B RECIPE APP secrets: NOT a blocker. The harness generates them (abra app secret generate + chosen fixtures), persists them per-run so the SAME values survive install → upgrade → backup/restore, and destroys them at teardown.

Begin: read /srv/cc-ci/cc-ci-plan/plan.md, then execute §1 Bootstrap, then enter the self-paced loop.
1
references/recipe-maintainer
Symbolic link
@ -0,0 +1 @@
/srv/recipe-maintainer/