cc-ci-orchestrator/cc-ci-plan/orchestrator-supervision.md
autonomic-bot 37a422bc31 refactor(wake): thin wake prompt -> points at orchestrator-supervision.md
The hourly wake prompt was hardcoding phase 5 / STATUS-5.md and going stale
as the build advanced. Make it a one-line pointer to a maintained doc
(orchestrator-supervision.md) that looks the CURRENT phase up live via
launch.py status — so the wake prompt never needs editing as phases change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 01:37:32 +00:00

# Orchestrator supervision routine
The routine the cc-ci orchestrator follows on each scheduled wake-up. The hourly wake (driven by the
watchdog) types a one-line pointer to THIS file, so the wake prompt itself never needs editing as the
build moves through phases. This doc holds the actual instructions and tells you how to look the current
phase up live.
Your job: supervise the two worker loops (Builder + Adversary, both on the claude backend), nudge
anything stalled, confirm progress, and otherwise stay hands-off. Do NOT make unrelated code changes.
## 1. Current phase — always read it live (never assume a specific phase)
Run `python3 cc-ci-plan/launch.py status`. It reports the CURRENT phase id, its plan file, and its
`STATUS-<id>.md` (from `.phases-spec`). Whatever it says now IS the phase; that phase's plan file (in
`cc-ci-plan/`) is the loops' single source of truth for what they're doing this phase.
## 2. Live-state checks
- builder / adv panes: `tmux capture-pane -pt cc-ci-builder` / `-pt cc-ci-adv`
- watchdog: `tmux capture-pane -pt cc-ci-watchdog` (+ `tail /srv/cc-ci/.cc-ci-logs/watchdog.log`)
- backend sanity: `.loop-backend` = `claude`, `.loop-model` = `sonnet`
- CI server: `ssh cc-ci hostname` (succeeds → CI server reachable)
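
The backend sanity check above could be scripted roughly as follows. This is only a sketch: it assumes `.loop-backend` and `.loop-model` are small plain-text files sitting together in one directory, which this document does not confirm. The `state_dir` parameter and the `check_backend` name are hypothetical.

```python
from pathlib import Path

# Expected values taken from the checklist above.
EXPECTED = {".loop-backend": "claude", ".loop-model": "sonnet"}


def check_backend(state_dir):
    """Return a list of problems; an empty list means the settings look sane.

    Assumption: each setting is a plain-text file in `state_dir` whose
    content (ignoring surrounding whitespace) is the configured value.
    """
    problems = []
    for name, want in EXPECTED.items():
        path = Path(state_dir) / name
        if not path.is_file():
            problems.append(f"{name}: missing")
            continue
        got = path.read_text().strip()
        if got != want:
            problems.append(f"{name}: expected {want!r}, found {got!r}")
    return problems
```

Calling `check_backend(".")` from whichever directory holds the two files returns `[]` when both match. Anything else is worth noting before nudging a loop.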
## 3. Keep them moving
- Builder stalled / idle past its WAITING-UNTIL with no work → nudge it to continue the current phase.
- Adversary stale / parked on old evidence → nudge it to re-orient and verify outstanding claims.
- The watchdog heals dead sessions and pings on `claim()`/`review()` commits, so intervene only where it
CAN'T: a wedged-but-alive loop, genuine drift, or a loop at high context (≳85%) that should `/compact`
(lossless, because state lives in git + the phase STATUS/REVIEW files).
- Loop session missing entirely → `RESUME_PHASE=1 LOOP_BACKEND=claude LOOP_MODEL=sonnet python3 cc-ci-plan/launch.py start`.
- If you revised a plan a loop is already working from, ping the session and tell the loop to re-read
it. Loops read the plan at kickoff and won't see later edits unless told.
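
The rules in this section can be read as a small triage function. The sketch below is illustrative only: the `choose_action` name, its three inputs, and the order in which the rules are tried are assumptions. Only the 85% threshold and the actions themselves come from the list above.

```python
CONTEXT_LIMIT_PCT = 85  # the "high context" threshold named above


def choose_action(session_exists, stalled, context_pct):
    """Map one loop's observed state to a single supervision action.

    Hypothetical inputs:
      session_exists: the loop's tmux session is present
      stalled:        idle past WAITING-UNTIL, or parked on old evidence
      context_pct:    context usage as a percentage (0-100)
    """
    if not session_exists:
        return "relaunch"   # run the launch.py start command
    if context_pct >= CONTEXT_LIMIT_PCT:
        return "compact"    # ask the loop to /compact
    if stalled:
        return "nudge"      # ping the session to continue or re-orient
    return "none"           # healthy: stay hands-off
```

For example, `choose_action(True, False, 40.0)` returns `"none"`, matching the hands-off default for a healthy loop.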
## 4. Completion
- A phase is done when its `STATUS-<id>.md` (in `/srv/cc-ci/cc-ci/machine-docs/`) has a line starting
with `## DONE` (meaning every gate is Adversary-verified and no VETO is standing). The watchdog then
auto-advances to the next phase (shown as `[n/N]` in status).
- When the LAST phase finishes, the watchdog writes `/srv/cc-ci/.cc-ci-logs/SEQUENCE-COMPLETE`, stops the
loops, and exits — so this hourly wake stops too (it lives in the watchdog). On that event: append a
completion note to `cc-ci-plan/JOURNAL.md` and send a proactive PushNotification to the operator.
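
Both completion signals above are easy to check by hand. A minimal sketch, assuming the STATUS file is ordinary UTF-8 text, is below; the function names are hypothetical, and the default marker path is the one given above.

```python
from pathlib import Path


def phase_done(status_text):
    """True if any line of the STATUS file text starts with '## DONE'."""
    return any(line.startswith("## DONE") for line in status_text.splitlines())


def sequence_complete(marker="/srv/cc-ci/.cc-ci-logs/SEQUENCE-COMPLETE"):
    """True once the watchdog has written the end-of-sequence marker file."""
    return Path(marker).exists()
```

To use it, read the `STATUS-<id>.md` that `launch.py status` reports for the current phase and pass its text to `phase_done`. Note that a `## DONE` appearing mid-line does not count, matching the "line starting with" wording.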
## 5. Be decisive but minimal
If everything is healthy and active, make no changes — just note the state. If prior-wake work is still
in progress, continue from the live state instead of restarting your analysis.