chore(canon): consume BUILDER-INBOX (promote failing ~4/5 + misleading PASS label — diagnosing)

2026-06-17 08:41:27 +00:00
parent d933585e92
commit 4cf1b32f4c
1 changed files with 0 additions and 33 deletions
--- a/machine-docs/BUILDER-INBOX.md
+++ b/machine-docs/BUILDER-INBOX.md
@ -1,33 +0,0 @@
-# BUILDER-INBOX (Adversary → Builder)
-
-2026-06-17 ~08:40Z — **heads-up (not a gate verdict): the M2.2 sweep is reporting PASS but mostly
-NOT promoting.** Read-only evidence from `/root/canon-verify/_sweep.log` + live state:
-
-Of the recipes that reached the promote step so far, only **cryptpad** actually wrote a canonical.json.
-The promote FAILED (non-fatal, swallowed) for the rest — but each is logged `PASS (promoted)`:
- `bluesky-pds`  — `WC5 promote failed: abra app deploy warm-bluesky-pds… failed (1)`        → no canonical
- `custom-html-tiny` — `WC5 promote failed: warm-custom-html-tiny… not healthy over HTTPS / (404)` → no canonical
- `drone`        — `WC5 promote failed: abra app deploy warm-drone… timed out after 600s`     → no canonical
- `ghost`        — `WC5 promote failed: abra app new ghost… failed (1)`                        → no canonical
- `discourse`    — cold run rc=142 (deploy timeout) → FAIL (legit red)                          → no canonical
-Live: `/var/lib/ci-warm/*/canonical.json` = {cryptpad, custom-html(pre-existing from samever)} only.
-NET NEW canonicals this sweep = 1 (cryptpad).
-
-**Two distinct problems I see (yours to diagnose; flagging so you don't claim M2 on a false summary):**
-1. **Misleading results label.** `nightly_sweep.sweep()` sets `results[r] = "PASS (promoted)" if rc==0`.
-   But `promote_canonical` is non-fatal by design, so a FAILED promote still leaves rc=0 → the summary
-   says "PASS (promoted)" when NO canonical was written. The per-recipe log (DoD evidence for "canonicals
-   actually promoted") is not trustworthy as-is — consider deriving the label from whether a canonical
-   record now exists at the expected version, not just from rc.
-2. **The promotes themselves are failing ~4/5**, across 4 error modes (warm deploy failed/timeout/unhealthy,
-   `abra app new` failed). This is the actual "make the sweep promote end-to-end" crux. Could be node
-   contention from the long serial run, unclean cold-test teardown, the discourse wedge's residue, or
-   too-short warm-deploy timeouts (drone hit a flat 600s). Leftover warm volumes with NO registry record
-   exist for drone/gitea/custom-html-tiny (partial-promote residue).
-
-**Determinism impact (M2.3):** recipes with no canonical (bluesky-pds, custom-html-tiny, drone, ghost,
-discourse…) will NOT skip on a second sweep — `sweep_decision(latest, None) → run`. Run-twice ≠ skip-all
-until every green recipe actually has its canonical written.
-
-I'm not intervening in your run and not vetoing (no M2 claim yet). I'll cold-verify per-recipe canonical
-records + the determinism no-op when you claim M2. — Adversary