cc-ci-orchestrator/cc-ci-plan/JOURNAL.md
autonomic-bot bca51071bd refactor: rewrite launchers as Python; add orchestrator JOURNAL.md
Bash scripts are now one-liner wrappers: exec python3 <script>.py "$@"
All logic lives in the Python scripts (pure stdlib, no deps).

launch.py — loops + watchdog:
  Full port of launch.sh: phase sequencing, start/stop/status/logs/watchdog,
  handoff signalling, stall detection, heal_session, heal_orchestrator.
  Cleaner structure: config block → helpers → phase/kickoff/agent/healing/
  handoff/watchdog/main. LOOP_BACKEND + LOOP_MODEL switches throughout.

launch-orchestrator.py — orchestrator session:
  claude path: --resume <id> preserved (conversation survives reboots).
  opencode path: run --attach --title (no --resume; STARTUP_PROMPT orients
  the new session; reads JOURNAL.md for context).
  STARTUP_PROMPT updated to reference JOURNAL.md on startup.

launch-upgrader.py — one-shot upgrade job:
  LOOP_BACKEND / LOOP_MODEL take precedence over UPGRADER_BACKEND / UPGRADER_MODEL.
  Both claude and opencode paths supported.

cc-ci-plan/JOURNAL.md — new orchestrator handoff file:
  Persistent across conversation resets. Documents the handoff format and
  carries the current session's summary: migration complete, phase 5 in
  progress (V3/V7 PASS), phase 4 deferred, open items for next session.

AGENTS.md: step 1 on startup = read JOURNAL.md; step 5 = append on handoff.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-31 17:50:09 +00:00

# Orchestrator journal

This file is the **persistent handoff record** for the cc-ci orchestrator. Every orchestrator
session (whether Claude or opencode) reads this on startup and appends to it when handing off or
when something noteworthy happens. It survives conversation resets: it is the memory that
`--resume` can't provide for opencode, and a more readable supplement for Claude sessions.

**On startup:** read this file before doing anything else. The most recent `## Session` entry
is where the previous session left off. Carry that context forward.

**On handoff / end of session:** append a `## Session` block (see format below) summarising
what happened, the current state, and anything the next session needs to know.

**On significant events mid-session:** append a `### Event` sub-entry (no need to wait for
handoff).

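
The startup step can be sketched in a few lines of stdlib Python. This is a minimal illustration, not the launchers' actual code: the helper names `latest_session` and `read_latest_session` are hypothetical, and it assumes every session entry begins with a level-2 heading whose text starts with `## Session`.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: session entries start with a level-2 "## Session ..." heading.
# "### Event" sub-entries do not match, so they stay inside their session.
SESSION_HEADING = re.compile(r"^## Session\b.*$", re.MULTILINE)


def latest_session(journal_text: str) -> Optional[str]:
    """Return the text of the last `## Session` block, or None if there is none."""
    matches = list(SESSION_HEADING.finditer(journal_text))
    if not matches:
        return None
    # Everything from the last session heading to end of file is that session.
    return journal_text[matches[-1].start():].strip()


def read_latest_session(path: str = "cc-ci-plan/JOURNAL.md") -> Optional[str]:
    """Hypothetical helper: load the journal and return its most recent session."""
    return latest_session(Path(path).read_text(encoding="utf-8"))
```

Because the format template in this file also contains a `## Session` line, the sketch returns that template when no real session has been appended yet.
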
---
## Format
```markdown
## Session YYYY-MM-DD HH:MM UTC — <backend> <model>
**Left off:** <one sentence: the last thing done>
**Phase / loop state:** <phase X [N/11], loops RUNNING/stopped, cc-ci healthy/issue>
**Open items:** <bullet list of anything the next session needs to act on, or "none">
**Notes:** <anything surprising, a decision made, a known blocker, etc.>
### Event HH:MM — <short label>
<brief note>
```
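
The handoff step can be sketched the same way. The block below renders a `## Session` entry following the template above and appends it to the journal. It is a minimal sketch under stated assumptions: `format_session` and `append_session` are hypothetical names, and writing the open items as a `- ` bullet list (or the word `none`) is one reading of the template, not a confirmed convention of the launchers.

```python
from datetime import datetime, timezone
from pathlib import Path


def format_session(backend, model, left_off, state, open_items, notes, now=None):
    """Render one `## Session` block in the journal's template layout."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%d %H:%M UTC")
    if open_items:
        # Bullets go on their own lines below the "Open items" label.
        items = "\n" + "\n".join(f"- {item}" for item in open_items)
    else:
        items = " none"
    return (
        # \u2014 is the dash the template uses between timestamp and backend.
        f"## Session {stamp} \u2014 {backend} {model}\n"
        f"**Left off:** {left_off}\n"
        f"**Phase / loop state:** {state}\n"
        f"**Open items:**{items}\n"
        f"**Notes:** {notes}\n"
    )


def append_session(path, block):
    """Append a rendered block to the journal, preceded by a blank line."""
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write("\n" + block)
```

Opening the file in append mode keeps earlier entries untouched, which matches the journal's append-only use.
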
---
## Session 2026-05-31 ~04:00 UTC — Claude Sonnet 4.6
**Left off:** Completed the orchestrator → Hetzner migration (cpx22, server 134487234, public
`168.119.126.100`, tailnet `cc-ci-orchestrator-1` @ `100.84.190.30`). The old Incus VM
(`100.116.55.106`) is still on the tailnet — cold standby, not yet deleted.
**Phase / loop state:** Phases 1c–1e, 2w, 2pc, 2, 2b, 3 all DONE. Phase 5 [11/11]
(upgrade-flow verify) in progress — loops running, actively verifying the `!testme`
end-to-end flow on the new Hetzner cc-ci server.
**Open items:**
- Phase 5 is in progress: loops need to finish V1–V9 and write `## DONE` to STATUS-5.md.
- Phase 4 (final review/polish) was deliberately **skipped** this session — it is queued
at idx 9 in PHASE_IDX_FILE. Resume it after the weekly Opus credits reset.
- Phase 6 (reconcile-only over all 18 recipe mirrors) and Phase 7 (full upgrade on n8n +
ghost + matrix-synapse) are planned but not yet started — run them after Phase 5 DONE.
- Old Incus orchestrator VM (`cc-ci-orchestrator`, `100.116.55.106`) is still running —
stop it via the b1 Incus API once happy with the Hetzner box. mTLS certs at
`/srv/incus-terraform-nix-vm-creator/terraform-secrets/`.
- DNS: `oc.commoninternet.net` A record → `100.84.190.30` still needs adding (operator step).
**Notes:**
- `cc-ci-loops.service` is **enabled** and wired with `reboot-log.sh` ExecStartPre — a reboot
is a non-event; loops + watchdog auto-resume via RESUME_PHASE=1.
- The cc-ci **server** also moved to Hetzner (server 134485294, `ssh cc-ci`
`100.95.31.88`). It has authenticated Docker Hub pulls and 150 GB disk — the old OOM /
disk-starvation / rate-limit issues are gone.
- All recipe mirrors currently reconcile correctly; no stale open PRs observed.
- `opencode` v1.15.13 installed at `/home/loops/.local/bin/opencode`. Tinfoil API key is in
`.testenv` as `TINFOIL_API_KEY`. Backend switch: `LOOP_BACKEND=opencode
LOOP_MODEL=tinfoil/deepseek-v4-pro RESUME_PHASE=1 cc-ci-plan/launch.sh start`.
- Launcher scripts rewritten to Python (`launch.py`, `launch-orchestrator.py`,
`launch-upgrader.py`); bash wrappers are now one-liners that `exec python3 <script> "$@"`.
### Event 03:13 — migrated from old Incus VM to Hetzner
Loops were started manually during staging (not by the service); first systemd-managed
boot was later this session. `cc-ci-loops.service` now enabled.
### Event 05:23 — phase 3 (results-UX) completed
All R1–R8 Adversary-verified, no VETO. Watchdog auto-advanced to phase 4.
### Event 13:22 — phase 4 paused, jumped to phase 5
Operator deferred phase 4 (weekly Opus credits exhausted). Phase idx manually set to 10
(phase 5). Loops restarted on Sonnet.
### Event 17:29 — loops stopped pending restart on different model
Operator paused loops to reconfigure backend (opencode/tinfoil exploration). Phase 5
[11/11] was in progress — loops had verified V1/V2/V3/V7 (custom-html-tiny upgrade GREEN).
Phase idx = 10 (phase 5), loops stopped, watchdog stopped.