
autonomic-bot bca51071bd refactor: rewrite launchers as Python; add orchestrator JOURNAL.md

Bash scripts are now one-liner wrappers: exec python3 <script>.py "$@"
All logic lives in the Python scripts (pure stdlib, no deps).

launch.py — loops + watchdog:
  Full port of launch.sh: phase sequencing, start/stop/status/logs/watchdog,
  handoff signalling, stall detection, heal_session, heal_orchestrator.
  Cleaner structure: config block → helpers → phase/kickoff/agent/healing/
  handoff/watchdog/main. LOOP_BACKEND + LOOP_MODEL switches throughout.

launch-orchestrator.py — orchestrator session:
  claude path: --resume <id> preserved (conversation survives reboots).
  opencode path: run --attach --title (no --resume; STARTUP_PROMPT orients
  the new session; reads JOURNAL.md for context).
  STARTUP_PROMPT updated to reference JOURNAL.md on startup.

launch-upgrader.py — one-shot upgrade job:
  LOOP_BACKEND / LOOP_MODEL take precedence over UPGRADER_BACKEND / UPGRADER_MODEL.
  Both claude and opencode paths supported.

cc-ci-plan/JOURNAL.md — new orchestrator handoff file:
  Persistent across conversation resets. Documents the handoff format and
  carries the current session's summary: migration complete, phase 5 in
  progress (V3/V7 PASS), phase 4 deferred, open items for next session.

AGENTS.md: step 1 on startup = read JOURNAL.md; step 5 = append on handoff.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-31 17:50:09 +00:00

5.3 KiB


cc-ci-orchestrator — AGENTS.md

This folder is the orchestrator workspace for building the cc-ci Co-op Cloud recipe CI server. It holds the plan, the launch/supervision tooling, and the two loop prompts. The actual CI project (NixOS config, test runner, recipe tests) lives in a separate repo the loops create at git.autonomic.zone/recipe-maintainers/cc-ci — do not confuse the two.

Three roles (don't conflate them)

Orchestrator — this session/role. Supervises: checks in on the two loops, reads their logs/STATUS, makes changes to the plan/prompts, restarts loops, and owns the VM-level fallback. It is separate from the loops and is the only role that should power-cycle/recreate the VM.
Builder loop — builds the CI server (cc-ci-plan/prompts/builder.md).
Adversary loop — independently disbelieves/verifies (cc-ci-plan/prompts/adversary.md).

The two loops coordinate only through the cc-ci git repo (see plan.md §6.1). The orchestrator watches from outside.

On startup: read the journal, announce yourself, report reboots

Every time you (the orchestrator) start or resume:

1. Read cc-ci-plan/JOURNAL.md: the most recent ## Session entry is where the previous session left off. This is the persistent handoff record; read it before anything else.
2. Read cc-ci-plan/REBOOTS.md (count entries) and run cc-ci-plan/launch.sh status (current phase + whether loops/watchdog are running).
3. PushNotification (proactive): "cc-ci orchestrator online — phase X, loops+watchdog running; N reboots logged (last )."
4. If loops are down, relaunch: RESUME_PHASE=1 cc-ci-plan/launch.sh start.
5. On handoff / end of session: append a ## Session block to JOURNAL.md summarising what happened, current state, and open items (see format in that file).
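The journal read in the first step can be sketched in a few lines of stdlib Python. This is only an illustration under assumptions: it assumes each entry starts with a line beginning `## Session` and that entries are appended in chronological order, so the last heading marks the newest entry. The function name `latest_session` is hypothetical and is not part of the launch scripts.

```python
import re


def latest_session(journal_text: str) -> str:
    """Return the last '## Session' block of the journal text, or '' if none exists.

    Assumption: entries are appended chronologically, so everything from the
    final '## Session' heading to the end of the file is the newest entry.
    """
    # Find the offset of every line that starts a '## Session' heading.
    starts = [
        m.start()
        for m in re.finditer(r"^## Session", journal_text, flags=re.MULTILINE)
    ]
    if not starts:
        return ""
    return journal_text[starts[-1]:].strip()
```

Reading the file itself would then be something like `latest_session(Path("cc-ci-plan/JOURNAL.md").read_text())`; the actual entry format is the one documented in JOURNAL.md.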

Reboot resilience is handled by cc-ci-loops.service (system unit): on boot it logs the reboot to REBOOTS.md (boot_id-gated) and runs launch.sh start with RESUME_PHASE=1, so the loops + watchdog auto-resume the saved phase. The orchestrator session itself is NOT auto-started; the operator reconnects to it (that's why the startup notification matters).

The orchestrator now runs on a Hetzner cpx22 cloud server (cc-ci-orchestrator-1, tailnet 100.84.190.30, public 168.119.126.100, flake host cc-ci-orchestrator-hetzner). See cc-ci-plan/plan-orchestrator-hetzner-migration.md. The earlier Pi→Incus-VM move is the historical cc-ci-plan/plan-orchestrator-migration.md.

Rebuild this host with nixos-rebuild switch --flake .#cc-ci-orchestrator-hetzner from /srv/cc-ci-orch.
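As a rough illustration of what "boot_id-gated" logging could look like, the sketch below appends one line per boot and skips the write when that boot's id is already in the log. It is not the unit's actual implementation. The function name `log_reboot_once` and the entry format are assumptions; the only fixed point is that Linux exposes a per-boot UUID at /proc/sys/kernel/random/boot_id.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

# Linux exposes a UUID here that changes on every boot.
BOOT_ID_PATH = Path("/proc/sys/kernel/random/boot_id")


def log_reboot_once(log_path: Path, boot_id: str,
                    now: Optional[datetime] = None) -> bool:
    """Append one entry for boot_id unless it is already logged.

    Returns True if a new entry was written, False if this boot was already
    recorded. The '- <timestamp> boot_id=<id>' line format is hypothetical.
    """
    existing = log_path.read_text() if log_path.exists() else ""
    if boot_id in existing:
        return False  # this boot is already logged; restarts of the unit are no-ops
    stamp = (now or datetime.now(timezone.utc)).isoformat(timespec="seconds")
    with log_path.open("a") as fh:
        fh.write(f"- {stamp} boot_id={boot_id}\n")
    return True
```

On a real host the id would come from `BOOT_ID_PATH.read_text().strip()`. Gating on the id rather than on a timestamp means a unit restart within the same boot does not inflate the reboot count.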

Keep the orchestrator open, under remote-control

Run this session as a long-lived interactive session with --remote-control so the operator can check in on the loops and steer/restart things from claude.ai/code (or the Claude mobile app) without being at the terminal.

Already in the session? Just run /remote-control — it attaches claude.ai/code to the live conversation (no exit, no resume needed).
Starting fresh: claude --remote-control 'autonomous-orchestrator' --dangerously-skip-permissions
Resuming this orchestrator later (history preserved):
```
claude --resume autonomous-orchestrator --remote-control "autonomous-orchestrator" --dangerously-skip-permissions
```
Note the two names are different: --resume <name|id> restores this conversation (the name set via -n/--name, shown in the /resume picker); the --remote-control [name] value is only the web display label and resumes nothing. The conversation persists on disk across exits; remote control itself only stays "connected" while the local process is alive (resume + re-enable to get it back after a full exit).

Use it to: tail loop logs (cc-ci-plan/launch.sh logs builder|adversary|watchdog), inspect STATUS.md/REVIEW.md in the cc-ci repo, edit the plan or prompts, restart a stuck loop, or power-cycle/recreate the cc-ci VM (see cc-ci-plan/kickoff.md → "Fallback: restart/recreate the cc-ci VM"). The orchestrator is the human's steering wheel; the loops are the engine.

Launch & supervise the loops

Source of truth for the loops: cc-ci-plan/plan.md (mission, Definition of Done, §1.5 credential map, §6 two-agent protocol, §7 loop discipline).
Launch/supervision guide: cc-ci-plan/kickoff.md.
cc-ci-plan/launch.sh start → both loops (interactive --remote-control in tmux) + a watchdog. tmux is installed; launch.sh defaults now point at /srv/cc-ci/....

Access & credentials (pointers only — values are gitignored)

.testenv (NOT committed): Tailscale auth key + Gitea bot creds. Load with set -a; . .testenv; set +a (never echo the values).
cc-ci: ssh cc-ci (root) directly — the orchestrator VM is a direct tailnet peer (100.90.116.4). No proxy. Key: ~/.ssh/cc-ci-root-ed25519. If unreachable, check tailscale status.
Incus/VM fallback: mTLS certs at /srv/incus-terraform-nix-vm-creator/terraform-secrets/; b1 is on the same tailnet (reach via the same proxy). See kickoff "Fallback".
Full credential map + how to use each: plan.md §1.5.
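For tooling written in Python rather than shell, a rough equivalent of `set -a; . .testenv; set +a` might look like the sketch below. It assumes the file holds only simple `KEY=VALUE` lines (optionally prefixed with `export`), performs no shell expansion, and returns key names only so that values are never printed. The function name `load_env_file` is hypothetical.

```python
import os
from pathlib import Path


def load_env_file(path, environ=None):
    """Load simple KEY=VALUE lines into environ and return the key names only.

    Assumption: no shell expansion, command substitution, or multi-line
    values; a real shell 'source' handles more than this sketch does.
    """
    environ = os.environ if environ is None else environ
    loaded = []
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        environ[key] = value
        loaded.append(key)
    return loaded
```

Returning only the key names keeps the "never echo the values" rule easy to follow: a caller can log which variables were loaded without touching their contents.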

Hard rule

Never commit secret values. .testenv, *.tfstate, *.key/*.pem, and the loop runtime/clone dirs are gitignored. Reference secret locations, never their contents (plan.md §9).
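One way to back this rule with a check is to compare candidate paths against the gitignored secret patterns before committing. The sketch below covers only the file patterns named above (.testenv, *.tfstate, *.key, *.pem), not the loop runtime/clone directories. The function name `secret_paths` is hypothetical and no such hook is described in this repo.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# File-name patterns taken from the rule above; directory rules are not covered.
SECRET_PATTERNS = (".testenv", "*.tfstate", "*.key", "*.pem")


def secret_paths(paths):
    """Return the subset of paths whose file name matches a secret pattern."""
    hits = []
    for p in paths:
        name = PurePosixPath(p).name  # match on the final path component only
        if any(fnmatchcase(name, pat) for pat in SECRET_PATTERNS):
            hits.append(p)
    return hits
```

The input could be the output of `git diff --cached --name-only` split into lines; a non-empty result would be a reason to abort the commit.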
