Replace the blind every-300s 'limit appears lifted' nudge (claude) and the
opencode-only _maybe_nudge_limit with one unified limit_tick state machine:
- parse the reset time from the limit banner (last match wins; stale banners
whose time already passed fall back rather than waiting ~a day)
- arm a quiet window until reset+45s; parse failure -> flat 5-minute probe
loop (operator-specified; not exponential backoff)
- while armed, suppress ALL healing: a limit-stalled session is NEVER
kill+rebooted (this was the conc-phase churn: claude limit stalls fell
through to the generic idle reboot, losing the banner and re-hitting
the limit fresh)
- at window end send ONE nudge as a self-verifying probe: spinner clears
the state; a re-printed banner re-arms from the fresh reset time
- dedupe: never stack a probe while our own text is visible in the pane
- state persisted per session in LOG_DIR (.limited-<session>) so watchdog
restarts keep the window
- orchestrator gets the same treatment: limit_tick in heal_orchestrator,
a per-signal-tick orch_limit_check, and hourly wakes deferred during
limit windows
- loud WARNING at 3 probes, then continue flat probes forever
Also rename the orchestrator session default cc-ci-orchestrator-vm ->
cc-ci-orchestrator (launch.py ORCH_SESSION, launch-orchestrator.py SESSION,
docs/scripts references).
One root doc maps every agent (Builder, Adversary, Orchestrator, Assistant,
Upgrader) -> its prompt + plan, with the watchdog and git coordination
protocol as the subtlety beneath. Fold the orchestrator supervision routine
into it (remove orchestrator-supervision.md). The hourly wake prompt and
AGENTS.md now just point at orchestration.md.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>