Commit Graph

17 Commits

Author SHA1 Message Date
e144354668 loops: mandate machine-docs/ for ALL coordination files (kickoff/prompts/plan/AGENTS)
Recent phases wrote STATUS/BACKLOG/REVIEW/JOURNAL to the repo ROOT because
build_kickoff + plan.md's tree used bare filenames, even though the loops'
AGENTS.md + INBOX/DECISIONS/DEFERRED conventions already said machine-docs/.
Make machine-docs/ the single mandated home everywhere: build_kickoff now
emits machine-docs/ paths + an explicit FILE-LOCATION RULE; both loop prompts
and plan.md (tree + seed step) updated; orchestrator AGENTS.md documents +
enforces it. resolve_state/INBOX handoff already read machine-docs/ first.
2026-06-11 20:56:24 +00:00
3fa3178546 watchdog: one-shot /upgrade-all trigger on phase-sequence completion
When LOG_DIR/.run-upgrade-on-complete exists, the watchdog launches
'launch-upgrader.py start' the moment the last phase reaches ## DONE (then
consumes the flag). Lets the operator replace a scheduled weekly cron run with
'run as soon as the current phase queue finishes' — used tonight: the
cc-ci-upgrade-all.timer was stopped (stamp forwarded past tonight's slot) and
this flag set instead.
2026-06-11 20:49:54 +00:00
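The one-shot trigger described in 3fa3178546 could be sketched roughly as follows. This is a minimal illustration, not the watchdog's actual code: the function name, the `launch` callback (a stand-in for running 'launch-upgrader.py start') and the `all_phases_done` argument are hypothetical; only the flag filename comes from the commit message.

```python
from pathlib import Path
from typing import Callable

FLAG_NAME = ".run-upgrade-on-complete"  # flag filename as given in the commit message


def maybe_trigger_upgrade(log_dir: Path, all_phases_done: bool,
                          launch: Callable[[], None]) -> bool:
    """Fire `launch` once if the flag exists and the phase queue has finished.

    `launch` is a hypothetical stand-in for starting the upgrader.
    Returns True only on the tick that actually triggered the launch.
    """
    flag = log_dir / FLAG_NAME
    if not (all_phases_done and flag.exists()):
        return False
    launch()       # start the upgrade first ...
    flag.unlink()  # ... then consume the flag so later ticks do not re-trigger
    return True
```

Because the flag is deleted after the launch, repeated watchdog ticks after completion are no-ops until an operator sets the flag again.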
4275adc4a5 watchdog: phase_done ignores placeholder '## DONE' sections (skipped mailu)
A Builder scaffolded 'STATUS-mailu.md' with a '## DONE / Not yet. Written
here only when ...' placeholder section; phase_done's startswith('## DONE')
matched it and auto-advanced past mailu without any of its work being done
(no recipe PR, no claim, no review). Harden phase_done: a '## DONE' heading
counts only when its first non-empty body line is not a placeholder/negation
(Not yet / pending / TBD / when all / <...> etc). Verified against all shipped
STATUS files (real DONEs still detected; mailu placeholder rejected).
2026-06-11 18:20:21 +00:00
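A hedged sketch of the hardened check from 4275adc4a5: a '## DONE' heading counts only when the first non-empty body line under it is not a placeholder. The regex below is an assumption modelled on the examples listed in the commit message; the real pattern list, and how an empty body is treated, are not shown in the log (here an empty body is assumed to mean "not done").

```python
import re
from typing import List, Optional

# Assumed placeholder patterns, modelled on the commit's examples
# (Not yet / pending / TBD / when all / <...>).
PLACEHOLDER_RE = re.compile(r"^\s*(not yet|pending|tbd|when all|<.*>)", re.IGNORECASE)


def first_body_line(lines: List[str]) -> Optional[str]:
    """Return the first non-empty line before the next heading, if any."""
    for line in lines:
        if line.startswith("#"):
            return None
        if line.strip():
            return line
    return None


def phase_done(status_text: str) -> bool:
    """True only if some '## DONE' section starts with real, non-placeholder content."""
    lines = status_text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("## DONE"):
            body = first_body_line(lines[i + 1:])
            if body is not None and not PLACEHOLDER_RE.match(body):
                return True
    return False
```

For example, `phase_done("## DONE\n\nNot yet. Written here only when ...")` is rejected, while a section whose first body line reports completed work is accepted.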
211b4e231c launch: per-phase model override (.loop-model[-adv]-<pid>)
Lets a single phase pin a different model, read fresh each role_model call so
a phase transition flips it automatically with no watchdog bounce. Operator
wants builder on opus for the complex dstamp phase, reverting to sonnet from
mailu onward: .loop-model-dstamp=opus while base .loop-model stays sonnet.
2026-06-11 16:15:18 +00:00
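The lookup order implied by 211b4e231c might look like the sketch below. The directory holding the files, the role names, the fallback chain for the adversary, and the `default` value are assumptions; only the `.loop-model[-adv]-<pid>` naming and the "read fresh on every call" behaviour come from the commit message.

```python
from pathlib import Path


def role_model(state_dir: Path, role: str, phase_id: str, default: str = "sonnet") -> str:
    """Resolve the model for a role, re-reading the files on every call.

    Nothing is cached, so a phase transition (a new phase_id) picks up its
    own override file without restarting the watchdog.
    """
    suffix = "-adv" if role == "adversary" else ""
    candidates = [
        state_dir / f".loop-model{suffix}-{phase_id}",  # per-phase override
        state_dir / f".loop-model{suffix}",             # role-level base (assumed)
        state_dir / ".loop-model",                      # shared base
    ]
    for path in candidates:
        if path.is_file():
            value = path.read_text().strip()
            if value:
                return value
    return default
```

With `.loop-model-dstamp` containing `opus` and `.loop-model` containing `sonnet`, the builder resolves to `opus` during dstamp and to `sonnet` in every other phase.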
969eb60df1 watchdog: probe-resumed tick returns True — don't evaluate stale pane after resume
The tick whose probe resumed a session was continuing into stall logic with
its pre-resume pane capture; a 4h-old WAITING-UNTIL in that stale data got
the freshly-resumed adversary kill+rebooted (05:52). Treat probe-resume as
handled-this-tick; the next 30s tick sees the live session.
2026-06-11 05:53:44 +00:00
5ea17fca21 watchdog: fix limit-probe self-match + scrollback dedupe wedge; plan(lvl5): badge shows level only
Night-watch findings (monthly-spend-limit window, ~01:49-04:45):
- probe text said 'usage limit' which matches LIMIT_RE, so a submitted probe
  kept limited_now true forever -> reworded to 'quota window' with a CAUTION
  note (nudge text must never match LIMIT_RE)
- dedupe scanned all 40 captured lines, so once a probe scrolled into the
  conversation no further probe ever fired (builder/adv frozen at nudges=1,
  orchestrator probes degraded to hourly riding the wake scroll) -> dedupe
  now only checks the bottom 8 lines (input area)
Core invariant HELD: zero kill+reboots during the limit window.

plan(lvl5): operator addition - the top-corner level badge (card, dashboard
pill, badge SVG) shows only the level number+color, zero capping info; the
inline per-rung table keeps intentional-skip/unverified detail.
2026-06-11 05:52:26 +00:00
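Both watchdog fixes in 5ea17fca21 can be illustrated with a small sketch. The `LIMIT_RE` pattern and the probe wording below are assumptions (neither is shown in the log); the 8-line input-area window and the rule that nudge text must never match `LIMIT_RE` come from the commit message.

```python
import re
from typing import Sequence

# Assumed pattern; the real LIMIT_RE is not shown in the log.
LIMIT_RE = re.compile(r"usage limit", re.IGNORECASE)

# Hypothetical probe wording: it deliberately says "quota window".
PROBE_TEXT = "Has the quota window reset? If so, please continue."

INPUT_AREA_LINES = 8  # only the bottom of the pane counts as the input area

# CAUTION (from the commit): nudge text must never match LIMIT_RE, otherwise
# a submitted probe keeps the session looking limited forever.
assert not LIMIT_RE.search(PROBE_TEXT)


def probe_already_visible(pane_lines: Sequence[str]) -> bool:
    """Dedupe check: is our own probe still sitting in the input area?

    Scanning only the last few lines means a probe that has scrolled up into
    the conversation no longer blocks the next probe.
    """
    tail = pane_lines[-INPUT_AREA_LINES:]
    return any(PROBE_TEXT in line for line in tail)
```

A module-level assertion like the one above is one way to make the self-match regression fail fast if the probe text is ever reworded.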
2e1ab8d384 watchdog: hourly orchestrator wake fires even during a limit window
Operator request: the hourly supervision prompt should land regardless of
limit state, as a fallback that keeps things on track if the limit-state
machinery ever breaks. If the limit is genuinely still in force the wake is
harmless (the banner just re-prints and limit_tick re-arms); once it lifts,
the queued wake doubles as a resume nudge.
2026-06-11 01:00:29 +00:00
d6e1a704da watchdog: parse limit-reset time, never reboot limit-stalled sessions; rename orch session
Replace the blind every-300s 'limit appears lifted' nudge (claude) and the
opencode-only _maybe_nudge_limit with one unified limit_tick state machine:

- parse the reset time from the limit banner (last match wins; stale banners
  whose time already passed fall back rather than waiting ~a day)
- arm a quiet window until reset+45s; parse failure -> flat 5-minute probe
  loop (operator-specified; not exponential backoff)
- while armed, suppress ALL healing: a limit-stalled session is NEVER
  kill+rebooted (this was the conc-phase churn: claude limit stalls fell
  through to the generic idle reboot, losing the banner and re-hitting
  the limit fresh)
- at window end send ONE nudge as a self-verifying probe: spinner clears
  the state; a re-printed banner re-arms from the fresh reset time
- dedupe: never stack a probe while our own text is visible in the pane
- state persisted per session in LOG_DIR (.limited-<session>) so watchdog
  restarts keep the window
- orchestrator gets the same treatment: limit_tick in heal_orchestrator,
  a per-signal-tick orch_limit_check, and hourly wakes deferred during
  limit windows
- loud WARNING at 3 probes, then continue flat probes forever

Also rename the orchestrator session default cc-ci-orchestrator-vm ->
cc-ci-orchestrator (launch.py ORCH_SESSION, launch-orchestrator.py SESSION,
docs/scripts references).
2026-06-11 00:55:07 +00:00
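The arming rule of the limit_tick state machine in d6e1a704da (last banner match wins, quiet window until reset+45s, flat 5-minute fallback) could be sketched as below. The banner format captured by `RESET_RE` is an assumption, since the real banner text is not shown in the log, and treating a stale banner the same as a parse failure is one reading of "fall back".

```python
import re
from datetime import datetime, timedelta

# Assumed banner format, e.g. "resets at 4:45am" or "resets 16:45".
RESET_RE = re.compile(r"resets? (?:at )?(\d{1,2}):(\d{2})\s*(am|pm)?", re.IGNORECASE)

GRACE = timedelta(seconds=45)      # quiet window ends at reset + 45s
FLAT_PROBE = timedelta(minutes=5)  # flat probe interval, no exponential backoff


def quiet_until(pane_text: str, now: datetime) -> datetime:
    """Return the time until which healing should stay suppressed."""
    matches = RESET_RE.findall(pane_text)
    if matches:
        hour_s, minute_s, ampm = matches[-1]  # last match wins
        hour, minute = int(hour_s), int(minute_s)
        if ampm:
            hour = hour % 12 + (12 if ampm.lower() == "pm" else 0)
        if hour < 24 and minute < 60:
            reset = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
            if reset > now:
                return reset + GRACE
            # Stale banner: the time already passed today, so do not wait ~a day.
    return now + FLAT_PROBE
```

When the returned time is reached, the commit's design sends a single probe; a re-printed banner would then be parsed again with this same function to re-arm the window.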
e0c9f23391 feat(launch): ADV_MODEL — per-role model override for the Adversary loop
2026-06-10 04:03:35 +00:00
c0852d2302 feat(logs): readable greppable per-agent transcript logs (agent-log.py)
The raw 'tmux pipe-pane' logs are TUI-escape soup (the 191MB builder log).
agent-log.py renders Claude's own JSONL transcript into a clean one-event-
per-line <agent>.clean.log — read-only on a file the agent writes anyway, so
zero agent slowdown and zero extra tokens. Resolves each agent's transcript
(disambiguating the shared project dir by kickoff signature; tracks restarts).
'follow-all' runs as the cc-ci-cleanlogs session, wired into launch.py start
so it comes up with the loops. render/tail subcommands for ad-hoc use.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 04:35:17 +00:00
2b617ba19f feat(launch): persist PHASES_SPEC to .phases-spec (status/watchdog/reboot agree)
Mirror the .loop-backend pattern: env wins, else the persisted file, else
the default build sequence. Without this, a custom single-phase run was
invisible to bare 'launch.py status' and would NOT survive a reboot (the
service has no PHASES_SPEC env). Now the current phase set is durable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 00:17:34 +00:00
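The precedence described in 2b617ba19f (env wins, else the persisted file, else the default) is sketched below. The function name, the directory argument and the `DEFAULT_PHASES_SPEC` placeholder are hypothetical; the real default build sequence is not shown in the log.

```python
import os
from pathlib import Path
from typing import Mapping

# Placeholder only: the actual default build sequence is not given in the log.
DEFAULT_PHASES_SPEC = "<default build sequence>"


def resolve_phases_spec(state_dir: Path, env: Mapping[str, str] = os.environ) -> str:
    """Resolve the phase spec: environment, then .phases-spec, then default."""
    env_value = env.get("PHASES_SPEC", "").strip()
    if env_value:
        return env_value
    persisted = state_dir / ".phases-spec"
    if persisted.is_file():
        value = persisted.read_text().strip()
        if value:
            return value
    return DEFAULT_PHASES_SPEC
```

Because the persisted file is consulted whenever the environment variable is absent, a process started without PHASES_SPEC (such as the service after a reboot) resolves the same phase set as the original launch.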
d349656c3b feat(launch): forward PHASES_SPEC/backend to watchdog; mark plan Phase 4 as operator gate
The watchdog is spawned into the existing tmux server and didn't reliably
inherit a custom PHASES_SPEC — it would fall back to the default 11-phase
spec and mis-detect completion. Forward PHASES_SPEC/PHASE_IDX_FILE/
LOOP_BACKEND/LOOP_MODEL explicitly in the watchdog command so custom
single-phase runs (like the mirror-enroll plan) work end-to-end. Also make
the mirror-enroll plan's live-host-deploy step an explicit claim-and-wait
operator gate for the loops.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 00:15:42 +00:00
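One way to forward the variables named in d349656c3b explicitly is to prefix the watchdog command with `env` assignments, as in this sketch. The command shape (`python3 launch.py watchdog`) and the helper name are assumptions; the log only says the four variables are passed in the watchdog command.

```python
import shlex
from typing import Mapping

# Variables the commit says are forwarded to the watchdog.
FORWARDED = ("PHASES_SPEC", "PHASE_IDX_FILE", "LOOP_BACKEND", "LOOP_MODEL")


def watchdog_command(env: Mapping[str, str], script: str = "launch.py watchdog") -> str:
    """Build a shell command that carries the forwarded variables with it.

    The `script` default is an assumed invocation, not taken from the source.
    Unset or empty variables are simply omitted.
    """
    assignments = [f"{name}={shlex.quote(env[name])}" for name in FORWARDED if env.get(name)]
    return " ".join(["env", *assignments, "python3", *shlex.split(script)])
```

Spelling the assignments out in the command avoids relying on whatever environment the already-running tmux server happens to hold.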
ca6e68c08d feat(orchestrator): fold hourly supervision wake into the watchdog
The standalone ai-progress-monitor.sh waker pinged a hardcoded
orchestrator session every 15m. Move that into the watchdog loop:
ORCH_WAKE_INTERVAL (default 3600s) types the supervision prompt into
the live orchestrator session, retrying each tick until it lands so a
busy or briefly-absent orchestrator is never interrupted and no hour is
skipped. Delete the now-redundant waker script; the prompt file is now
driven by the watchdog. Reboot-safe by inheritance (the watchdog is
started by cc-ci-loops.service).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 21:46:20 +00:00
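The retry-until-it-lands behaviour from ca6e68c08d can be sketched as a small timer that stays due until a send succeeds. The class, its `start` argument and the `try_send` callback are hypothetical; only the ORCH_WAKE_INTERVAL default of 3600s and the per-tick retry come from the commit message.

```python
from typing import Callable

ORCH_WAKE_INTERVAL = 3600.0  # seconds, default from the commit message


class OrchWake:
    """Tracks when the supervision prompt is next due."""

    def __init__(self, interval: float = ORCH_WAKE_INTERVAL, start: float = 0.0) -> None:
        self.interval = interval
        self.last_sent = start

    def tick(self, now: float, try_send: Callable[[], bool]) -> bool:
        """Call once per watchdog tick; returns True when the wake landed.

        `try_send` is a hypothetical callback that returns False when the
        orchestrator is busy or absent. In that case the wake stays due and
        is retried on the next tick instead of being dropped.
        """
        if now - self.last_sent < self.interval:
            return False
        if not try_send():
            return False
        self.last_sent = now
        return True
```

The interval is measured from the last successful send, so a wake delayed by a busy orchestrator pushes the next one back by the same amount rather than firing twice in quick succession.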
24bf379b5b feat(assistant): add opencode launcher and phase 6/7 plans
2026-06-01 12:59:03 +00:00
3412100240 fix(opencode): all issues from first live run resolved
1. API key: opencode doesn't support env: substitution in apiKey — write
   actual key value to ~/.config/opencode/opencode.jsonc at setup time
   (file is not committed to git; key sourced from .testenv).
2. Permission system: add permission:"allow" to opencode config (equivalent
   to --dangerously-skip-permissions) to avoid interactive prompts.
3. Submit key: opencode TUI uses Enter (return) to submit; Ctrl+S not
   needed. ping_session already uses Enter — keep as is.
4. Startup timing: bump opencode TUI init wait from 4s to 8s so the TUI
   is fully connected to the server before bootstrap is sent.
5. Backend persistence: LOOP_BACKEND/LOOP_MODEL written to .loop-backend /
   .loop-model so the watchdog uses them when restarting dead sessions.

All tested: both builder and adversary sessions alive, deepseek-v4-pro
processing kickoffs via tinfoil inference.tinfoil.sh, no API/permission
errors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-31 18:21:10 +00:00
cd5e645427 fix(opencode): use inference.tinfoil.sh + attach TUI + NO_COLOR
Three fixes discovered during first live run:
- inference host is inference.tinfoil.sh not api.tinfoil.sh (control plane
  only serves /v1/models, not /v1/chat/completions)
- opencode run exits after one turn; switch to opencode attach for the
  persistent TUI, then ping_session sends the kickoff prompt
- NO_COLOR=1 suppresses the first-run interactive theme picker

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-31 17:56:06 +00:00
bca51071bd refactor: rewrite launchers as Python; add orchestrator JOURNAL.md
Bash scripts are now one-liner wrappers: exec python3 <script>.py "$@"
All logic lives in the Python scripts (pure stdlib, no deps).

launch.py — loops + watchdog:
  Full port of launch.sh: phase sequencing, start/stop/status/logs/watchdog,
  handoff signalling, stall detection, heal_session, heal_orchestrator.
  Cleaner structure: config block → helpers → phase/kickoff/agent/healing/
  handoff/watchdog/main. LOOP_BACKEND + LOOP_MODEL switches throughout.

launch-orchestrator.py — orchestrator session:
  claude path: --resume <id> preserved (conversation survives reboots).
  opencode path: run --attach --title (no --resume; STARTUP_PROMPT orients
  the new session; reads JOURNAL.md for context).
  STARTUP_PROMPT updated to reference JOURNAL.md on startup.

launch-upgrader.py — one-shot upgrade job:
  LOOP_BACKEND / LOOP_MODEL take precedence over UPGRADER_BACKEND / UPGRADER_MODEL.
  Both claude and opencode paths supported.

cc-ci-plan/JOURNAL.md — new orchestrator handoff file:
  Persistent across conversation resets. Documents the handoff format and
  carries the current session's summary: migration complete, phase 5 in
  progress (V3/V7 PASS), phase 4 deferred, open items for next session.

AGENTS.md: step 1 on startup = read JOURNAL.md; step 5 = append on handoff.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-31 17:50:09 +00:00