From d933585e92326752deb68a945ceb27322b27ad13 Mon Sep 17 00:00:00 2001
From: autonomic-bot
Date: Wed, 17 Jun 2026 08:40:41 +0000
Subject: [PATCH] =?UTF-8?q?note(canon):=20pre-claim=20finding=20=E2=80=94?=
 =?UTF-8?q?=20sweep=20PASS-label=20vs=20actual=20promote=20failures=20(4/5?=
 =?UTF-8?q?),=20determinism=20risk;=20evidence=20captured=20for=20M2=20ver?=
 =?UTF-8?q?ification?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 machine-docs/REVIEW-canon.md | 40 ++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/machine-docs/REVIEW-canon.md b/machine-docs/REVIEW-canon.md
index 6dff747..9ae19dc 100644
--- a/machine-docs/REVIEW-canon.md
+++ b/machine-docs/REVIEW-canon.md
@@ -171,3 +171,43 @@
 for greens / reds left intact / no-new-tag skipped (M2.2); run-twice→skip-all untagged (M2.4);
 real timer fire advances canonicals via full main() incl. roll (M2.5); samever never fires in-sweep (M2.6);
 disk budget recorded (M2.7); §2.G UPGRADE_BASE_VERSION retirement (M2.8). Staying read-only while the sweep is in flight (single node).
+
+---
+
+## Pre-claim finding @ 2026-06-17T08:40Z — M2.2 sweep: PASS-labelled but promotes mostly FAILING (evidence captured)
+
+NOT a verdict (M2 unclaimed). Read-only capture from `/root/canon-verify/_sweep.log` so the evidence
+survives log growth. Per-recipe promote outcomes observed (alphabetical sweep, ~7 recipes deep):
+- bluesky-pds: cold rc=0; `WC5 promote failed: abra app deploy warm-bluesky-pds… failed (1)` → NO canonical; logged `PASS (promoted)`.
+- cryptpad: cold rc=0; `canonical cryptpad advanced to known-good 0.6.0+v2026.5.1` → canonical WRITTEN. ✓ (the only real promote so far)
+- custom-html: SKIP no-new-version (pre-existing canonical). ✓ expected.
+- custom-html-tiny: cold rc=0; `WC5 promote failed: warm-custom-html-tiny… not healthy over HTTPS / (404)` → NO canonical; logged `PASS (promoted)`.
+- discourse: cold rc=142 (deploy timeout — the 51m wedge I flagged) → `FAIL (canonical unchanged)`. Legit red.
+- drone: cold rc=0; `WC5 promote failed: …warm-drone… timed out after 600 seconds` → NO canonical; logged `PASS (promoted)`.
+- ghost: cold rc=0; `WC5 promote failed: abra app new ghost… failed (1)` → NO canonical; logged `PASS (promoted)`.
+- gitea: promote in progress at capture.
+Live `/var/lib/ci-warm/*/canonical.json` = {cryptpad, custom-html} only. NET NEW this sweep = 1 (cryptpad).
+Leftover warm volumes w/ NO registry record: drone, gitea, custom-html-tiny (partial-promote residue).
+
+**DEFECT-1 [adversary] (results-label):** `nightly_sweep.sweep()` line ~119 sets
+`results[r] = "PASS (promoted)" if rc==0 else "FAIL …"`. Because `promote_canonical` is non-fatal
+(swallows its own exception so it "never fails a green run"), a FAILED promote still yields rc=0 →
+the summary asserts "PASS (promoted)" when NO canonical was written. The per-recipe results log — the
+DoD's evidence that "canonicals actually promoted for the green recipes" — is therefore UNTRUSTWORTHY.
+Repro: `grep "WC5 promote failed" _sweep.log` vs `grep "PASS (promoted)" _sweep.log` — failed promotes
+appear in BOTH. Fix direction: label from "does a canonical record now exist at the tested version",
+not from rc.
+
+**DEFECT-2 [adversary] (promote path failing broadly):** 4 of 5 completed promotes FAILED across 4
+modes (warm `app deploy` failed(1) / timed-out 600s / unhealthy-404 / `app new` failed(1)). Cold CI is
+green for each, so this is specifically the WARM-CANONICAL promote deploy failing — the exact
+end-to-end step this phase exists to make real. Root cause TBD (node contention on the long serial
+run / unclean cold-test teardown / discourse residue / flat 600s warm timeout) — Builder's to diagnose.
+
+**Determinism risk (M2.3):** every recipe left without a canonical (bluesky-pds, custom-html-tiny,
+drone, ghost, discourse…) will `sweep_decision(latest, None) → run` on a second sweep, NOT skip — so
+run-twice ≠ skip-all until promotes actually succeed. I will hard-test this at the M2 claim.
+
+Sent the Builder a BUILDER-INBOX heads-up (ba28a88). When M2 is claimed I will cold-verify, per recipe,
+that a canonical record exists at the tested tag version (not trust the PASS label), and re-run the
+determinism no-op myself. If promotes are still failing / mislabelled, M2 FAILs.
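The DEFECT-1 fix direction and the M2.3 determinism risk can be sketched together as below. This is a minimal sketch, not the code in `nightly_sweep`. The helper names (`read_canonical_version`, `label_result`, `second_sweep_decision`), the `"version"` key in `canonical.json`, the third label string, and the skip rule are all assumptions made for illustration. Only the `<warm root>/<recipe>/canonical.json` layout, the `PASS (promoted)` and `FAIL (canonical unchanged)` labels, and the `sweep_decision(latest, None) → run` behaviour are taken from the note above. The `drone` version in the demo is a placeholder, not a captured value.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def read_canonical_version(warm_root: Path, recipe: str) -> Optional[str]:
    """Return the version stored in <warm_root>/<recipe>/canonical.json, or None.

    ASSUMPTION: the record is a JSON object with a "version" key. The real
    schema is not shown in the review note.
    """
    path = warm_root / recipe / "canonical.json"
    try:
        data = json.loads(path.read_text())
    except (OSError, ValueError):
        # Missing file, unreadable file, or invalid JSON: treat as "no canonical".
        return None
    if not isinstance(data, dict):
        return None
    version = data.get("version")
    return version if isinstance(version, str) else None


def label_result(rc: int, tested_version: str, canonical_version: Optional[str]) -> str:
    """Label a recipe from the canonical record on disk, not from rc alone."""
    if rc != 0:
        return "FAIL (canonical unchanged)"
    if canonical_version == tested_version:
        return "PASS (promoted)"
    # Cold run was green, but no canonical exists at the tested version.
    # HYPOTHETICAL label text; the note only asks that this case not read as promoted.
    return "PASS (cold green, promote FAILED)"


def second_sweep_decision(latest: str, canonical_version: Optional[str]) -> str:
    """Hypothetical stand-in for sweep_decision: skip only if canonical is at latest."""
    return "skip" if canonical_version == latest else "run"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Mirror the captured state for one promoted recipe (cryptpad).
        (root / "cryptpad").mkdir()
        (root / "cryptpad" / "canonical.json").write_text(
            json.dumps({"version": "0.6.0+v2026.5.1"})
        )
        # "1.2.3" is a placeholder version for a recipe with no canonical record.
        for recipe, tested in [("cryptpad", "0.6.0+v2026.5.1"), ("drone", "1.2.3")]:
            canon = read_canonical_version(root, recipe)
            print(recipe, "|", label_result(0, tested, canon),
                  "|", second_sweep_decision(tested, canon))
```

Under these assumptions, a recipe whose promote failed can no longer carry `PASS (promoted)`, and the same lookup shows which recipes a second sweep would re-run rather than skip.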