Watchdog handoff signalling: ping the waiting loop on gate-claim / verdict (kill double-idle)
launch.sh watchdog now runs a fast (~30s) handoff_check alongside the heavy (300s) restart/DONE check: when the Builder writes a CLAIMED gate it pings the Adversary to verify now; when the Adversary updates REVIEW.md it pings the Builder to proceed (edge-triggered, reads local clones). So a pending handoff resolves in <~30s instead of a whole idle interval.

Pacing revised: the Adversary may idle freely when nothing's pending (no pointless re-verify/busy-poll) and is woken by the watchdog; Builder waits on the ping + a fallback ~2-4m self-poll. kickoff documents the new "handoff signalling" role.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
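To make "edge-triggered" concrete, here is a minimal sketch of what such a check could look like. It is not the code in `launch.sh`: the function names, the `STATE_DIR` variable, the `GATES.md` file name, and the assumption that `REVIEW.md` sits at the top of the Adversary's clone are all hypothetical. The idea is only that a ping fires when the observed state changes, not on every pass while it stays the same.

```shell
#!/bin/sh
# Hypothetical sketch: none of these names are taken from launch.sh.
STATE_DIR="${STATE_DIR:-${TMPDIR:-/tmp}/handoff-state}"

# edge_changed KEY VALUE
# Succeeds only when VALUE differs from the value recorded for KEY on the
# previous call, then records VALUE. The first call for a key only stores a
# baseline and reports "no edge", so a watchdog restart does not re-ping.
edge_changed() {
    key=$1
    value=$2
    mkdir -p "$STATE_DIR"
    file="$STATE_DIR/$key"
    if [ ! -f "$file" ]; then
        printf '%s\n' "$value" > "$file"
        return 1
    fi
    last=$(cat "$file")
    if [ "$last" = "$value" ]; then
        return 1
    fi
    printf '%s\n' "$value" > "$file"
    return 0
}

# fingerprint FILE: checksum of the file, or "absent" if it does not exist.
fingerprint() {
    if [ -f "$1" ]; then
        cksum < "$1"
    else
        echo absent
    fi
}

# handoff_check BUILDER_CLONE ADVERSARY_CLONE
# Prints one line per ping that should be sent on this pass.
handoff_check() {
    builder_clone=$1
    adversary_clone=$2
    # Assumption: claimed gates appear as lines containing CLAIMED in GATES.md.
    claimed=$(grep 'CLAIMED' "$builder_clone/GATES.md" 2>/dev/null | cksum)
    if edge_changed claimed "$claimed" &&
        grep -q 'CLAIMED' "$builder_clone/GATES.md" 2>/dev/null; then
        echo "ping adversary: verify now"
    fi
    if edge_changed review "$(fingerprint "$adversary_clone/REVIEW.md")"; then
        echo "ping builder: proceed"
    fi
}
```

A watchdog would call `handoff_check` roughly every 30 seconds and deliver each printed line to the matching session. One consequence of the baseline-on-first-call choice is that a gate already standing when the watchdog starts produces no ping, which is one reason the fallback self-poll still matters.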
@@ -652,17 +652,21 @@ the *specific* thing. Three cases:
 1. **Something in flight** (build/deploy/`nixos-rebuild`) → re-check on a short cadence (≈4 min) to
 stay cache-warm; keep polling *it*, don't treat it as idle, and don't spin on a minutes-long build.
 2. **Blocked on the *other* loop** — Builder parked at a `CLAIMED` gate awaiting the Adversary, or
-Adversary waiting for the Builder to fix an `[adversary]` finding → **poll on the short ≈4 min
-cadence for the counterpart's response; do NOT use the long idle sleep.** A pending handoff is not
-idleness — the other loop may respond any moment, and if *both* loops long-idle here you get dead
-wall-clock where neither advances. (This is the common "both waiting" trap — avoid it.)
+Adversary waiting for the Builder to fix an `[adversary]` finding. **You don't need to busy-poll
+here: the watchdog signals across the handoff.** The moment the Builder writes a `CLAIMED` gate,
+the watchdog pings the Adversary to verify *now*; the moment the Adversary updates `REVIEW.md`
+(verdict/finding), it pings the Builder to proceed (`launch.sh`, ~30 s detection). So you may sleep
+while blocked and trust the ping — but keep a **fallback self-poll on a modest cadence (~2–4 min)**
+in case a ping is missed (a dead session is restarted by the watchdog and re-orients from the repo
+anyway). The goal: a pending handoff resolves in well under a minute, not a whole idle interval.
 3. **Genuinely idle, nothing pending from either loop** → sleep ~10–15 min, then re-orient.
 
-Corollary for the Adversary: a standing `CLAIMED` gate is immediate top-priority work (verify it now,
-don't idle past it); absent a gate, run background break-it probes / re-verify stale D-gates rather
-than sleeping — so the Adversary is rarely idle while the Builder is active. Corollary for the
-Builder: prefer keeping an unblocked backlog item in hand so you're not fully blocked on a gate; only
-hit case 2 when everything is genuinely gated behind the pending verification.
+Notes: **The Adversary may idle freely when nothing is pending — it should NOT pointlessly re-verify
+or busy-poll to look busy.** It gets woken by the watchdog the instant the Builder claims a gate, so
+"start verifying very soon after the Builder waits" is handled by the signal, not by the Adversary
+spinning. **The Builder** should prefer keeping an unblocked backlog item in hand so it's rarely
+*fully* blocked on a gate; only hit case 2 when everything is genuinely gated behind the pending
+verification — and then rely on the watchdog ping (+ fallback poll) rather than a long idle.
 
 **Anti-drift guards.**
 - Cap retries: if an approach fails 3× the same way, stop, write the dead-end in `DECISIONS.md`,
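The three pacing cases in the revised text can be summarised as a lookup from loop state to sleep interval. The sketch below is illustrative only: the state names and the function `next_sleep_seconds` are hypothetical, and each number is one point picked from the range the text gives (≈4 min, ~2–4 min, ~10–15 min).

```shell
#!/bin/sh
# Hypothetical helper: maps a loop state to how long to sleep before the
# next self-check. State names are invented for this sketch.
next_sleep_seconds() {
    case $1 in
        in_flight) echo 240 ;;  # case 1: build/deploy running, re-check at ~4 min
        blocked)   echo 180 ;;  # case 2: fallback self-poll (~2-4 min); the ping may arrive sooner
        idle)      echo 750 ;;  # case 3: nothing pending, ~10-15 min
        *)         echo "unknown state: $1" >&2; return 2 ;;
    esac
}
```

In case 2 the interval is only a ceiling: a loop that receives the watchdog ping would stop waiting and act immediately.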