cc-ci-orchestrator/cc-ci-plan/JOURNAL.md
autonomic-bot bca51071bd refactor: rewrite launchers as Python; add orchestrator JOURNAL.md
Bash scripts are now one-liner wrappers: exec python3 <script>.py "$@"
All logic lives in the Python scripts (pure stdlib, no deps).

launch.py — loops + watchdog:
  Full port of launch.sh: phase sequencing, start/stop/status/logs/watchdog,
  handoff signalling, stall detection, heal_session, heal_orchestrator.
  Cleaner structure: config block → helpers → phase/kickoff/agent/healing/
  handoff/watchdog/main. LOOP_BACKEND + LOOP_MODEL switches throughout.

launch-orchestrator.py — orchestrator session:
  claude path: --resume <id> preserved (conversation survives reboots).
  opencode path: run --attach --title (no --resume; STARTUP_PROMPT orients
  the new session; reads JOURNAL.md for context).
  STARTUP_PROMPT updated to reference JOURNAL.md on startup.

launch-upgrader.py — one-shot upgrade job:
  LOOP_BACKEND / LOOP_MODEL take precedence over UPGRADER_BACKEND / UPGRADER_MODEL.
  Both claude and opencode paths supported.
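
The precedence rule for launch-upgrader.py can be sketched as below. This is a minimal illustration, not the script's actual code: the helper names `resolve` and `upgrader_settings` are hypothetical, and treating an empty variable as unset is an assumption.

```python
import os


def resolve(env, loop_key, upgrader_key, default=None):
    """Return env[loop_key] if set and non-empty, else env[upgrader_key], else default.

    Assumption: an empty string counts as "not set".
    """
    return env.get(loop_key) or env.get(upgrader_key) or default


def upgrader_settings(env=None):
    """Pick backend and model, letting the LOOP_* variables win over UPGRADER_*."""
    env = os.environ if env is None else env
    return {
        "backend": resolve(env, "LOOP_BACKEND", "UPGRADER_BACKEND"),
        "model": resolve(env, "LOOP_MODEL", "UPGRADER_MODEL"),
    }
```

Passing a plain dict instead of `os.environ` keeps the rule easy to check in isolation.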

cc-ci-plan/JOURNAL.md — new orchestrator handoff file:
  Persistent across conversation resets. Documents the handoff format and
  carries the current session's summary: migration complete, phase 5 in
  progress (V3/V7 PASS), phase 4 deferred, open items for next session.

AGENTS.md: step 1 on startup = read JOURNAL.md; step 5 = append on handoff.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-31 17:50:09 +00:00


Orchestrator journal

This file is the persistent handoff record for the cc-ci orchestrator. Every orchestrator session (whether Claude or opencode) reads this on startup and appends to it when handing off or when something noteworthy happens. It survives conversation resets — it is the memory that --resume can't provide for opencode, and a more readable supplement for Claude sessions.

On startup: read this file before doing anything else. The most recent ## Session entry is where the previous session left off. Carry that context forward.
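
One way to pick out the most recent session entry is sketched below. It assumes entries are only ever appended, so the last `## Session` heading in the file is the newest; the function names and the default path argument are illustrative, not part of the launchers.

```python
import re
from pathlib import Path
from typing import Optional

# Matches a level-2 "## Session ..." heading at the start of a line.
# "### Event" sub-headings do not match, so they stay inside their session.
SESSION_HEADING = re.compile(r"^## Session\b.*$", re.MULTILINE)


def latest_session(journal_text: str) -> Optional[str]:
    """Return the text from the last '## Session' heading to the end of the file."""
    matches = list(SESSION_HEADING.finditer(journal_text))
    if not matches:
        return None
    return journal_text[matches[-1].start():].strip()


def read_latest_session(path: str = "cc-ci-plan/JOURNAL.md") -> Optional[str]:
    """Load the journal from disk and return its most recent session block."""
    return latest_session(Path(path).read_text(encoding="utf-8"))
```

Note that the format template in this file also starts with `## Session`, so a journal with no real entries yet would return the template itself.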

On handoff / end of session: append a ## Session block (see format below) summarising what happened, the current state, and anything the next session needs to know.

On significant events mid-session: append a ### Event sub-entry (no need to wait for handoff).


Format

## Session YYYY-MM-DD HH:MM UTC — <backend> <model>
**Left off:** <one sentence: what was the last thing done>
**Phase / loop state:** <phase X [N/11], loops RUNNING/stopped, cc-ci healthy/issue>
**Open items:** <bullet list of anything the next session needs to act on, or "none">
**Notes:** <anything surprising, a decision made, a known blocker, etc.>

### Event HH:MM — <short label>
<brief note>
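
The format above could be produced and appended programmatically along these lines. This is a sketch under assumptions: the function names are hypothetical, open items are rendered as `-` bullets, and a naive `now` is interpreted as local time before conversion to UTC.

```python
from datetime import datetime, timezone

SEP = "\u2014"  # the dash separator used in the journal's heading format


def _utc(now=None):
    """Return `now` converted to UTC, or the current UTC time if omitted."""
    return (now or datetime.now(timezone.utc)).astimezone(timezone.utc)


def format_session(backend, model, left_off, state, open_items, notes, now=None):
    """Render a '## Session' block; an empty open_items list becomes 'none'."""
    now = _utc(now)
    if open_items:
        items = "\n" + "\n".join(f"- {item}" for item in open_items)
    else:
        items = " none"
    return (
        f"## Session {now:%Y-%m-%d %H:%M} UTC {SEP} {backend} {model}\n"
        f"**Left off:** {left_off}\n"
        f"**Phase / loop state:** {state}\n"
        f"**Open items:**{items}\n"
        f"**Notes:** {notes}\n"
    )


def format_event(label, note, now=None):
    """Render a '### Event' sub-entry for a mid-session note."""
    return f"### Event {_utc(now):%H:%M} {SEP} {label}\n{note}\n"


def append_entry(path, entry):
    """Append an entry to the journal, separated by a blank line; never rewrites."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write("\n" + entry)
```

Opening the file in append mode matches the journal's append-only use: earlier sessions are never touched.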

Session 2026-05-31 ~04:00 UTC — Claude Sonnet 4.6

Left off: Completed the orchestrator → Hetzner migration (cpx22, server 134487234, public 168.119.126.100, tailnet cc-ci-orchestrator-1 @ 100.84.190.30). The old Incus VM (100.116.55.106) is still on the tailnet — cold standby, not yet deleted.

Phase / loop state: Phases 1c–1e, 2w, 2pc, 2, 2b, 3 all DONE. Phase 5 [11/11] (upgrade-flow verify) is in progress: loops are running and actively verifying the !testme end-to-end flow on the new Hetzner cc-ci server.

Open items:

  • Phase 5 is in progress: loops need to finish V1–V9 and write ## DONE to STATUS-5.md.
  • Phase 4 (final review/polish) was deliberately skipped this session — it is queued at idx 9 in PHASE_IDX_FILE. Resume it after the weekly Opus credits reset.
  • Phase 6 (reconcile-only over all 18 recipe mirrors) and Phase 7 (full upgrade on n8n + ghost + matrix-synapse) are planned but not yet started — run them after Phase 5 DONE.
  • Old Incus orchestrator VM (cc-ci-orchestrator, 100.116.55.106) is still running — stop it via the b1 Incus API once happy with the Hetzner box. mTLS certs at /srv/incus-terraform-nix-vm-creator/terraform-secrets/.
  • DNS: oc.commoninternet.net A record → 100.84.190.30 still needs adding (operator step).

Notes:

  • cc-ci-loops.service is enabled and wired with reboot-log.sh ExecStartPre — a reboot is a non-event; loops + watchdog auto-resume via RESUME_PHASE=1.
  • The cc-ci server also moved to Hetzner (server 134485294, ssh cc-ci @ 100.95.31.88). It has authenticated Docker Hub pulls and 150 GB disk, so the old OOM / disk-starvation / rate-limit issues are gone.
  • All recipe mirrors currently reconcile correctly; no stale open PRs observed.
  • opencode v1.15.13 installed at /home/loops/.local/bin/opencode. Tinfoil API key is in .testenv as TINFOIL_API_KEY. Backend switch: LOOP_BACKEND=opencode LOOP_MODEL=tinfoil/deepseek-v4-pro RESUME_PHASE=1 cc-ci-plan/launch.sh start.
  • Launcher scripts rewritten in Python (launch.py, launch-orchestrator.py, launch-upgrader.py); the bash wrappers are now one-liners that exec python3 <script>.py "$@".

Event 03:13 — migrated from old Incus VM to Hetzner

Loops were started manually during staging (not by the service); first systemd-managed boot was later this session. cc-ci-loops.service now enabled.

Event 05:23 — phase 3 (results-UX) completed

All of R1–R8 Adversary-verified, no VETO. Watchdog auto-advanced to phase 4.

Event 13:22 — phase 4 paused, jumped to phase 5

Operator deferred phase 4 (weekly Opus credits exhausted). Phase idx manually set to 10 (phase 5). Loops restarted on Sonnet.

Event 17:29 — loops stopped pending restart on different model

Operator paused loops to reconfigure backend (opencode/tinfoil exploration). Phase 5 [11/11] was in progress — loops had verified V1/V2/V3/V7 (custom-html-tiny upgrade GREEN). Phase idx = 10 (phase 5), loops stopped, watchdog stopped.