Operator-directed: pause Phase 2, build the warm-data + --quick system, then resume Phase 2.
- live-warm keycloak (SSO dep, realm-per-run), data-warm canonicals (undeploy keeps volume),
cold = authoritative default. --quick reattaches the canonical, upgrades to PR head, asserts,
and rolls back to the last-known-good snapshot on failure (never loses working data).
- known-good = raw volume copy taken while undeployed (consistent), one per app, advanced ONLY
by green cold runs; a nightly full-cold sweep refreshes canonicals + is a daily regression run.
- launch.sh: insert 2w at the current index (Phase 2 -> resumes after 2w DONE); seq is now
1c 1b 1d 1e 2w 2 2b 3 4.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- heal_session: detect the unrecoverable "thinking/redacted_thinking blocks cannot
be modified" 400 (recurs every turn, session stays alive so the dead-check misses
it) and kill+restart the loop fresh (re-orients from repo). Consolidates the
dead/fatal/limit handling for builder+adversary.
- heal_orchestrator: keep the orchestrator alive too, conflict-safe. Restarts via
launch-orchestrator.sh ONLY when no orchestrator is alive anywhere — liveness
detects both a managed cc-ci-orchestrator tmux session AND a hand-launched
terminal session (any non-loop claude), so it never double-resumes the
conversation (the likely cause of the thinking-block crashes). Kill+restart if
the managed session is wedged on the FATAL error. Toggle: WATCH_ORCHESTRATOR=0.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
handoff_check's now="$(grep CLAIMED.*awaiting ... )" returned non-zero when a phase's STATUS
has no claimed-awaiting lines yet (normal early in a phase); under set -euo pipefail that
assignment exited the whole watchdog. Append `|| true` to the now= and cur= command
substitutions. Verified: watchdog survives the handoff tick on a freshly-created STATUS-1c.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make the launcher drive an ordered phase sequence (default 1c then 1b). Each phase has its own
plan + phase-namespaced loop-state files (STATUS-<id>.md/BACKLOG/REVIEW/JOURNAL); the watchdog
auto-transitions when the current phase's STATUS-<id>.md shows ## DONE, and STOPS after the last
phase (writes SEQUENCE-COMPLETE, exits) as a manual gate before Phase 2. start_agent injects a
phase preamble (source-of-truth = phase plan; phase-namespaced state) ahead of the base role
prompt. DONE detection reads the builder's local clone (reliable, no push-lag). Handoff signalling
+ resilience preserved and made phase-scoped (reset baseline on transition).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Adversary got a spurious "gate CLAIMED" ping: STATUS.md keeps historical
"Gate: Mn — CLAIMED, awaiting Adversary" lines after they PASS, and on watchdog restart the
first observation pinged on those already-passed lines. Now track the SET of gate ids on
CLAIMED-awaiting lines and ping only when an id NEWLY appears vs the prior observation, after a
silent baseline. A gate passing (line kept) or evidence edits don't re-ping; restart re-baselines
without pinging. Verified: watchdog restart no longer pings.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bug: start_watchdog used $0, which breaks when launch.sh is called by a relative path
(the watchdog tmux session cd's into PLAN_DIR, so a relative $0 no longer resolves —
"No such file or directory", watchdog dies instantly). Resolve BASH_SOURCE to an absolute
SELF once and use it for the watchdog self-invocation. Verified: watchdog now starts and
its handoff_check immediately pinged the Adversary about a standing CLAIMED gate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
launch.sh watchdog now runs a fast (~30s) handoff_check alongside the heavy (300s) restart/DONE
check: when the Builder writes a CLAIMED gate it pings the Adversary to verify now; when the
Adversary updates REVIEW.md it pings the Builder to proceed (edge-triggered, reads local clones).
So a pending handoff resolves in <~30s instead of a whole idle interval. Pacing revised: the
Adversary may idle freely when nothing's pending (no pointless re-verify/busy-poll) and is woken
by the watchdog; Builder waits on the ping + a fallback ~2-4m self-poll. kickoff documents the
new "handoff signalling" role.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Planning + launch + setup material for the cc-ci Co-op Cloud recipe CI server:
plan.md (single source of truth), kickoff/launch supervision, and the
Builder/Adversary loop prompts. Secrets (.testenv) and runtime dirs are gitignored.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>