Bash scripts are now one-liner wrappers: exec python3 <script>.py "$@"
All logic lives in the Python scripts (pure stdlib, no deps).
launch.py — loops + watchdog:
Full port of launch.sh: phase sequencing, start/stop/status/logs/watchdog,
handoff signalling, stall detection, heal_session, heal_orchestrator.
Cleaner structure: config block → helpers → phase/kickoff/agent/healing/
handoff/watchdog/main. LOOP_BACKEND + LOOP_MODEL switches throughout.
launch-orchestrator.py — orchestrator session:
claude path: --resume <id> preserved (conversation survives reboots).
opencode path: run --attach --title (no --resume; STARTUP_PROMPT orients
the new session; reads JOURNAL.md for context).
STARTUP_PROMPT updated to reference JOURNAL.md on startup.
launch-upgrader.py — one-shot upgrade job:
LOOP_BACKEND / LOOP_MODEL take precedence over UPGRADER_BACKEND / UPGRADER_MODEL.
Both claude and opencode paths supported.
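The precedence rule for launch-upgrader.py could look roughly like the sketch below. The helper name, the treatment of empty values as unset, and the "claude" fallback are assumptions for illustration, not taken from the script.

```python
import os


def resolve_setting(env, loop_key, upgrader_key, default):
    """Return the LOOP_* value if set, else the UPGRADER_* value, else default.

    Empty or whitespace-only values count as unset (an assumption).
    """
    for key in (loop_key, upgrader_key):
        value = env.get(key, "").strip()
        if value:
            return value
    return default


if __name__ == "__main__":
    # "claude" as the final fallback is a guess, not confirmed by the source.
    backend = resolve_setting(os.environ, "LOOP_BACKEND", "UPGRADER_BACKEND", "claude")
    print(backend)
```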
cc-ci-plan/JOURNAL.md — new orchestrator handoff file:
Persistent across conversation resets. Documents the handoff format and
carries the current session's summary: migration complete, phase 5 in
progress (V3/V7 PASS), phase 4 deferred, open items for next session.
AGENTS.md: step 1 on startup = read JOURNAL.md; step 5 = append on handoff.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
configuration.nix:
- systemd.services.opencode-web: one shared opencode server on 127.0.0.1:4096,
EnvironmentFile=/srv/cc-ci/.testenv (TINFOIL_API_KEY), ExecStartPre clears
stale /tmp/opencode so restarts never fail on the EEXIST race.
- services.nginx: reverse-proxy oc.commoninternet.net → localhost:4096,
bound to tailscale IP 100.84.190.30 (tailnet-only, plain HTTP).
DNS: A record oc.commoninternet.net → 100.84.190.30 (operator step).
launch.sh + launch-upgrader.sh:
- Drop per-session ports / OPENCODE_HOST; add OPENCODE_SERVER=http://127.0.0.1:4096.
- opencode backend: agents use `opencode run --attach $OPENCODE_SERVER --title $session`
so each shows up as a named session in the web UI.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Gitea repos renamed:
cc-ci-autonomous-orchestrator → cc-ci-orchestrator
cc-ci-orchestrator → archived-cc-ci-orchestrator
Updated in this workspace:
- README.md, AGENTS.md: repo title
- cc-ci-plan/plan-orchestrator-migration.md: cc-ci-autonomous-orchestrator refs
- cc-ci-plan/plan-repo-consolidation.md: marked complete + Pi remote-update notice
- cc-ci-plan/launch-orchestrator.sh, launch.sh: session naming comment cleanup
NOTE: Pi clone still has the old origin URL. On the Pi, run:
git remote set-url origin https://git.autonomic.zone/recipe-maintainers/cc-ci-orchestrator.git
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
tmux `send-keys -l <long msg>` often leaves the text UNSENT in the input box (the
immediate Enter is swallowed while the TUI ingests the paste). msg-loop.sh and
launch.sh's ping_session now type the message, then retry Enter/C-m until the leading
text is no longer in the input box (= submitted) or a bounded loop gives up.
- msg-loop.sh: standalone reliable messenger for orchestrator use.
- launch.sh ping_session: same retry-submit (loads on next watchdog restart).
Live-tested: delivered first try.
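A minimal Python sketch of the retry-submit idea follows. The three callables are hypothetical stand-ins: in a real messenger they would presumably wrap `tmux send-keys -l`, `tmux send-keys Enter` (or `C-m`), and a pane capture. The probe length, retry count, and delay are illustrative values, not those of msg-loop.sh.

```python
import time


def send_and_submit(type_text, press_enter, read_input_box, message,
                    max_tries=5, delay=0.5, sleep=time.sleep):
    """Type `message`, then press Enter until its leading text leaves the input box.

    Returns the attempt number that submitted the message, or 0 if the
    bounded loop gave up.
    """
    probe = message[:20]  # leading text used to detect "still unsent"
    if not probe:
        raise ValueError("message must not be empty")
    type_text(message)
    for attempt in range(1, max_tries + 1):
        press_enter()
        sleep(delay)  # give the TUI time to ingest the paste
        if probe not in read_input_box():
            return attempt  # text is gone from the box, so it was submitted
    return 0
```

Passing the tmux operations in as callables keeps the retry logic testable without a live session.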
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The weekly upgrade run now executes inside a dedicated, remote-control agent
(cc-ci-upgrader) — viewable/steerable at claude.ai/code like the Builder — rather
than buried in headless cron output.
- launch-upgrader.sh: spins up the cc-ci-upgrader tmux session under
--remote-control with a kickoff that runs /upgrade-all (DEFAULT mode) to
completion. On finish the agent STOPS and stays idle (does NOT self-terminate)
so the run + summary stay reviewable in the web UI. `start` = use-or-create:
leaves an in-flight (busy) run alone, else clears a finished/idle/wedged
session and runs fresh; `fresh` always restarts. UPGRADER_ARGS passes flags
(e.g. --dry-run); never --with-tests.
- launch.sh: orchestrator_alive() now also skips the cc-ci-upgrader
remote-control name, so the upgrader job isn't mistaken for the orchestrator.
- upgrade-all skill: documents it runs as the cc-ci-upgrader agent; the weekly
cron invokes `launch-upgrader.sh start` (not /upgrade-all inline).
- Phase 5: V8a verifies the agent lifecycle (launch → run to completion → stay
idle/viewable → next start clears it); V9 stops the verification session.
- cron memory: weekly task = launch-upgrader.sh start at 0 3 * * 6 UTC.
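The `start` / `fresh` semantics described for launch-upgrader.sh can be summarised as a small decision function. The state labels and return strings below are invented for this sketch; the script's own session detection is not shown in the source.

```python
def upgrader_action(command, session_state):
    """Decide what the launcher does for `start` or `fresh`.

    session_state is one of "absent", "busy", "idle", "finished", "wedged"
    (labels chosen for this sketch only).
    """
    if command == "fresh":
        return "restart"  # fresh always restarts
    if command == "start":
        if session_state == "busy":
            return "leave-running"  # never disturb an in-flight run
        if session_state == "absent":
            return "create"
        return "clear-and-run"  # finished, idle, or wedged session
    raise ValueError(f"unknown command: {command}")
```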
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Phase 5 is appended as the LAST phase in the launcher sequence (… 3 4 5). It can only run
once cc-ci is fully built — the !testme-on-recipe-PR flow depends on Phase 3
(results UX) surfacing the run result back on the PR for testme-on-pr.sh to read.
DoD (Adversary cold-verifies): !testme on a recipe PR is the real gate + results
land in the PR (V1); testme-on-pr.sh reads GREEN/RED/PENDING + BUILD url, POST=0
polls without re-triggering (V2); /recipe-upgrade default end-to-end green on a
sandbox recipe, nothing merged (V3); the ≤3 !testme regression loop (V4); stale
test DEFAULT = comment-only, no test edit (V5); --with-tests opens+verifies a
cc-ci test PR, paired (V6); mirror reconcile closes merged/superseded PRs and
main==upstream (V7); /upgrade-all default dry-run + small live run never edits
tests (V8); all verification PRs closed + deploys torn down (V9). Use a sandbox
recipe; never merge; never weaken tests. Watchdog reloaded (seq …3 4 5).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Root cause of the adversary "overrun": stall_check rebooted the instant
now >= WAITING-UNTIL (zero grace), but the loop's own ScheduleWakeup fires AT
that stated time — and the runtime scheduled it ~40s later than the marker
(date-vs-scheduler skew). So the watchdog pre-empted a HEALTHY self-wake by
~37s; the loop wasn't wedged, it was killed just before it woke. That was the
single false reboot at 18:55Z.
Fix: split the two cases cleanly.
- Marker present: reboot only when now > WAITING-UNTIL + STALL_GRACE (180s) —
covers wake+start latency + marker/scheduler skew, so the watchdog only fires
if the self-wake GENUINELY failed.
- No marker: unchanged — reboot when idle >= STALL_IDLE (300s).
Verified post-fix: adversary self-woke on time and re-paced (WAITING-UNTIL
19:19:30Z); no new stall reboots.
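The two-case rule can be sketched in Python as below. STALL_GRACE and STALL_IDLE use the values stated above; the function name and signature are assumptions, and the example timestamps are illustrative rather than taken from the logs.

```python
from datetime import datetime, timedelta, timezone

STALL_GRACE = timedelta(seconds=180)  # grace after the stated resume time
STALL_IDLE = timedelta(seconds=300)   # idle limit when no marker is present


def should_reboot(now, waiting_until, idle_for):
    """Return True if the watchdog should reboot the loop.

    waiting_until is the parsed WAITING-UNTIL time, or None if there is no marker.
    """
    if waiting_until is not None:
        # Marker present: only a wake that is overdue beyond the grace counts.
        return now > waiting_until + STALL_GRACE
    # No marker: unchanged idle rule.
    return idle_for >= STALL_IDLE


if __name__ == "__main__":
    marker = datetime(2025, 1, 1, 19, 19, 30, tzinfo=timezone.utc)
    # 40s past the marker is inside the grace window, so no reboot.
    print(should_reboot(marker + timedelta(seconds=40), marker, timedelta(seconds=400)))
```

With zero grace, the first call would have rebooted a loop that was about to wake on its own, which is the false positive described above.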
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The builder wedged at the context limit (garbled output) — alive but matching
none of heal_session's signals (dead/FATAL/limit), so the watchdog left it
stuck. Fix: loops now declare every wait, and the watchdog reboots a wait that
never resumes.
- plan.md §7 + both prompts: cap every wait at 10 min (chunk longer waits);
before going idle, the loop's FINAL line must be `WAITING-UNTIL: <ISO8601 UTC>`
(the resume time, matching its ScheduleWakeup); run /compact proactively at
~80% context to avoid wedging near the limit.
- launch.sh: new stall_check (runs every 30s signal tick) — reboots a loop idle
>= STALL_IDLE (300s) when it has NO current WAITING-UNTIL marker as its last
message OR is past the time the marker named; a healthy paced wait (marker
present, before its time) is left alone. Complements heal_session's
dead/FATAL/limit cases. Reboot is safe — loops re-orient from git + STATUS.
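Reading the marker might look like the following sketch. It assumes the loop's output is available as plain text and that the marker is the last non-empty line; how launch.sh actually captures and trims the pane is not shown in the source.

```python
import re
from datetime import datetime, timezone

MARKER = re.compile(r"^WAITING-UNTIL:\s*(\S+)\s*$")


def parse_waiting_until(pane_text):
    """Return the resume time (UTC) if the last non-empty line is a marker, else None."""
    lines = [ln.strip() for ln in pane_text.splitlines() if ln.strip()]
    if not lines:
        return None
    match = MARKER.match(lines[-1])
    if not match:
        return None
    stamp = match.group(1)
    if stamp.endswith("Z"):
        stamp = stamp[:-1] + "+00:00"  # older fromisoformat() rejects a bare "Z"
    try:
        parsed = datetime.fromisoformat(stamp)
    except ValueError:
        return None  # garbled marker: treat as "no marker"
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # marker is specified as UTC
    return parsed.astimezone(timezone.utc)
```

Treating an unparseable marker as absent means a wedged loop with garbled output falls through to the plain idle rule.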
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The earlier `git add` included an already-`git rm`'d pathspec, so it errored and
staged nothing — launch.sh (3r removal) and .gitignore (track .claude/skills/)
were left uncommitted while the skill files went in via a separate -f add.
Runtime was already correct (watchdog reads the working-tree launch.sh); this
just syncs git HEAD to the working tree.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Deterministic CI stays the primary, AI-free path. Adds a separate on-demand skill (ships in the
cc-ci repo .claude/skills/ci-test-review/) that runs the full suite across all recipes and, per
failure, AI-diagnoses + classifies: recipe PR (+ proposed change) vs CI-server PR vs stale-test;
or 'all passed, recipes+tests up to date' (incl. a latest-version freshness check). Proposes, never
auto-merges (operator-merge rule). Slotted 3 -> 3r -> 4. AI only diagnoses; execution stays
deterministic.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replaces the brittle markdown prose-match ("Gate: … CLAIMED, awaiting Adversary") with detection of
the loops' conventional commit prefixes on origin/main: a new `claim(...)` commit pings the
Adversary; a new `review(...)` commit pings the Builder. Edge-triggered on the origin/main SHA
(append-only — no force-push), no file parsing, can't mis-route. The loops already use these
prefixes consistently; codified as a load-bearing contract in plan.md §6.1 + both prompts so it
stays reliable. INBOX detection unchanged (pushed-state, file-routed).
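A sketch of prefix-based routing is shown below. It assumes the watcher already has the subject lines of commits that are new on origin/main since the last observed SHA (for example from `git log --format=%s old..new`); the function and target names are hypothetical.

```python
import re

PREFIX = re.compile(r"^(claim|review)\(")
ROUTES = {"claim": "adversary", "review": "builder"}


def route_handoff(last_sha, head_sha, new_subjects):
    """Return the loops to ping for commits that appeared since last_sha."""
    if head_sha == last_sha:
        return []  # edge-triggered: nothing new on origin/main
    targets = []
    for subject in new_subjects:
        match = PREFIX.match(subject)  # the prefix must start the subject line
        if match:
            target = ROUTES[match.group(1)]
            if target not in targets:
                targets.append(target)
    return targets
```

Because the match is anchored at the start of the subject, a commit that merely mentions `claim(...)` in prose does not trigger a ping.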
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The handoff pings fired on the writer's LOCAL working-tree write (before push), so the receiver
pulled a stale origin/main, saw "no formal gate", and a clarifying inbox round-trip ensued
(several minutes + wasted turns per handoff). And the gate-id parser read "WC1" as "C1" and could
fire on prose mentions.
Fix (1): handoff_check now `git fetch`es and reads origin/main (what the receiver will pull), via
_wd_fetch_origin + _wd_show_pushed, for STATUS / REVIEW / both INBOXes — a ping only fires once the
claim/verdict is actually pushed, so the receiver's pull always sees it. Eliminates the stale-pull
"premature" dance.
Fix (2): gate-claim detection matches ONLY a formal line (Gate: <id> … CLAIMED, awaiting Adversary)
and edge-triggers on a genuinely-new such line compared whole — no firing on historical
"CLAIMED detail" lines or prose; gate-id is a best-effort label only.
Loops' clones have a credential helper (reads .testenv) so the watchdog's fetch works
non-interactively. Verified.
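Fix (2) can be illustrated with an anchored match plus a whole-line comparison. The regular expression is a guess at the formal line's shape, based on the format quoted above, and the helper names are invented for this sketch.

```python
import re

# The line must START with "Gate:" and END with the formal claim text.
FORMAL = re.compile(r"^Gate:\s*(?P<id>\S+).*CLAIMED, awaiting Adversary\s*$")


def formal_claims(status_text):
    """Return every formal gate-claim line, stripped, in file order."""
    lines = (ln.strip() for ln in status_text.splitlines())
    return [ln for ln in lines if FORMAL.match(ln)]


def new_claims(previous_lines, status_text):
    """Return formal claim lines not seen in the previous observation."""
    seen = set(previous_lines)
    return [ln for ln in formal_claims(status_text) if ln not in seen]


def gate_label(line):
    """Best-effort gate id, used only as a label in the ping."""
    match = FORMAL.match(line)
    return match.group("id") if match else "?"
```

Taking the id as the whole first token after `Gate:` avoids reading "WC1" as "C1", and the anchors keep prose mentions from matching.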
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Operator-directed: pause Phase 2, build the warm-data + --quick system, then resume Phase 2.
- live-warm keycloak (SSO dep, realm-per-run), data-warm canonicals (undeploy keeps volume),
cold = authoritative default. --quick reattaches the canonical, upgrades to PR head, asserts,
and rolls back to the last-known-good snapshot on failure (never loses working data).
- known-good = raw volume copy taken while undeployed (consistent), one per app, advanced ONLY
by green cold runs; a nightly full-cold sweep refreshes canonicals + is a daily regression run.
- launch.sh: insert 2w at the current index (Phase 2 -> resumes after 2w DONE); seq is now
1c 1b 1d 1e 2w 2 2b 3 4.
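Inserting a phase at the current index can be sketched as follows. The pre-insert sequence used in the example is inferred by removing 2w from the sequence stated above; the helper is hypothetical.

```python
def insert_phase(sequence, current, new_phase):
    """Insert new_phase in front of `current`; return (new_sequence, index).

    The returned index points at new_phase, so the launcher runs it next
    and resumes `current` once new_phase is DONE.
    """
    index = sequence.index(current)
    return sequence[:index] + [new_phase] + sequence[index:], index


if __name__ == "__main__":
    seq, idx = insert_phase("1c 1b 1d 1e 2 2b 3 4".split(), "2", "2w")
    print(" ".join(seq), "| next:", seq[idx])
```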
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- heal_session: detect the unrecoverable "thinking/redacted_thinking blocks cannot
be modified" 400 (recurs every turn, session stays alive so the dead-check misses
it) and kill+restart the loop fresh (re-orients from repo). Consolidates the
dead/fatal/limit handling for builder+adversary.
- heal_orchestrator: keep the orchestrator alive too, conflict-safe. Restarts via
launch-orchestrator.sh ONLY when no orchestrator is alive anywhere — liveness
detects both a managed cc-ci-orchestrator tmux session AND a hand-launched
terminal session (any non-loop claude), so it never double-resumes the
conversation (the likely cause of the thinking-block crashes). Kill+restart if
the managed session is wedged on the FATAL error. Toggle: WATCH_ORCHESTRATOR=0.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
handoff_check's now="$(grep CLAIMED.*awaiting ... )" returned non-zero (grep exits 1 on no match) when a phase's STATUS
has no claimed-awaiting lines yet (normal early in a phase); under set -euo pipefail that
assignment exited the whole watchdog. Append `|| true` to the now= and cur= command
substitutions. Verified: watchdog survives the handoff tick on a freshly-created STATUS-1c.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make the launcher drive an ordered phase sequence (default 1c then 1b). Each phase has its own
plan + phase-namespaced loop-state files (STATUS-<id>.md/BACKLOG/REVIEW/JOURNAL); the watchdog
auto-transitions when the current phase's STATUS-<id>.md shows ## DONE, and STOPS after the last
phase (writes SEQUENCE-COMPLETE, exits) as a manual gate before Phase 2. start_agent injects a
phase preamble (source-of-truth = phase plan; phase-namespaced state) ahead of the base role
prompt. DONE detection reads the builder's local clone (reliable, no push-lag). Handoff signalling
+ resilience preserved and made phase-scoped (reset baseline on transition).
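The transition rule can be sketched as a pure function. Matching `## DONE` at the start of a line is an assumption about how the launcher detects it; returning None stands in for writing SEQUENCE-COMPLETE and exiting.

```python
import re

DONE = re.compile(r"^## DONE\b", re.MULTILINE)


def advance(sequence, current, status_text):
    """Return the phase to run next.

    Stays on `current` until its STATUS shows a DONE heading, then moves to
    the following phase, or returns None after the last one.
    """
    if not DONE.search(status_text):
        return current
    index = sequence.index(current)
    if index + 1 < len(sequence):
        return sequence[index + 1]
    return None  # sequence complete: stop as a manual gate
```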
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Adversary got a spurious "gate CLAIMED" ping: STATUS.md keeps historical
"Gate: Mn — CLAIMED, awaiting Adversary" lines after they PASS, and on watchdog restart the
first observation pinged on those already-passed lines. Now track the SET of gate ids on
CLAIMED-awaiting lines and ping only when an id NEWLY appears vs the prior observation, after a
silent baseline. A gate passing (line kept) or evidence edits don't re-ping; restart re-baselines
without pinging. Verified: watchdog restart no longer pings.
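The set-based edge trigger with a silent baseline might be sketched like this. The class name and the regular expression are assumptions; only the behaviour (baseline first, then ping on newly appearing ids) comes from the description above.

```python
import re

CLAIM = re.compile(r"Gate:\s*(\S+).*CLAIMED, awaiting Adversary")


class GateWatcher:
    """Tracks gate ids on CLAIMED-awaiting lines between observations."""

    def __init__(self):
        self.seen = None  # None means no baseline has been taken yet

    def observe(self, status_text):
        """Return the set of gate ids that newly appeared since the last call."""
        matches = (CLAIM.search(line) for line in status_text.splitlines())
        ids = {m.group(1) for m in matches if m}
        if self.seen is None:
            self.seen = ids
            return set()  # first observation is a silent baseline
        fresh = ids - self.seen
        self.seen = ids
        return fresh
```

A fresh `GateWatcher` after a watchdog restart re-baselines on whatever lines are already in STATUS.md, so historical claims never ping.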
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bug: start_watchdog used $0, which breaks when launch.sh is called by a relative path
(the watchdog tmux session cd's into PLAN_DIR, so a relative $0 no longer resolves —
"No such file or directory", watchdog dies instantly). Resolve BASH_SOURCE to an absolute
SELF once and use it for the watchdog self-invocation. Verified: watchdog now starts and
its handoff_check immediately pinged the Adversary about a standing CLAIMED gate.
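The same idea in Python, as a rough sketch: resolve the invocation path against the directory the script was started from, once, before anything changes directory. The helper and the example paths are hypothetical; in a real script `Path(__file__).resolve()` at startup would play this role.

```python
import os


def absolute_self(invoked_as, start_dir):
    """Resolve the script path against the directory it was launched from.

    An already-absolute path is returned unchanged (apart from normalisation).
    """
    return os.path.normpath(os.path.join(start_dir, invoked_as))


if __name__ == "__main__":
    # Compute once at startup; a later chdir no longer affects the result.
    self_path = absolute_self("./launch.sh", os.getcwd())
    print(self_path)
```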
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
launch.sh watchdog now runs a fast (~30s) handoff_check alongside the heavy (300s) restart/DONE
check: when the Builder writes a CLAIMED gate it pings the Adversary to verify now; when the
Adversary updates REVIEW.md it pings the Builder to proceed (edge-triggered, reads local clones).
So a pending handoff resolves in <~30s instead of a whole idle interval. Pacing revised: the
Adversary may idle freely when nothing's pending (no pointless re-verify/busy-poll) and is woken
by the watchdog; Builder waits on the ping + a fallback ~2-4m self-poll. kickoff documents the
new "handoff signalling" role.
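The two cadences can share one loop, as in the sketch below. The intervals match the ~30s and 300s figures above; the callables and the bounded tick count are placeholders so the loop can be exercised without real sleeps.

```python
import time

FAST_INTERVAL = 30     # seconds between handoff checks
HEAVY_INTERVAL = 300   # seconds between restart/DONE checks


def run_ticks(n_ticks, handoff_check, heavy_check, sleep=time.sleep):
    """Run n_ticks fast ticks; run the heavy check on every 10th tick."""
    heavy_every = HEAVY_INTERVAL // FAST_INTERVAL
    for tick in range(1, n_ticks + 1):
        sleep(FAST_INTERVAL)
        handoff_check()          # cheap, edge-triggered ping logic
        if tick % heavy_every == 0:
            heavy_check()        # expensive restart / DONE inspection
```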
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Planning + launch + setup material for the cc-ci Co-op Cloud recipe CI server:
plan.md (single source of truth), kickoff/launch supervision, and the
Builder/Adversary loop prompts. Secrets (.testenv) and runtime dirs are gitignored.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>