diff --git a/REVIEW-rcust.md b/REVIEW-rcust.md
index b55fbb3..aab4c31 100644
--- a/REVIEW-rcust.md
+++ b/REVIEW-rcust.md
@@ -307,3 +307,59 @@ ports), filtering for non-mechanical error-handling (raise/assert/except/exit/ti
 
 Net: exactly ONE accidental hook-port regression (lasuite-drive), now under approved fix. No other
 best-effort↔fatal flips. This audit closes the M1-method gap for the hook bodies.
+
+---
+
+### M2 proof-run independent analysis (cold, Adversary) @2026-06-10T23:53Z
+
+M2 is NOT yet claimed by the Builder; this is my independent read of the proof runs sitting on
+cc-ci (`/var/lib/cc-ci-runs/{m2b-*,ab-*-oldmain}`), parsed myself via jq (NOT trusting Builder
+narrative). The 6 first-sweep mismatches break down as follows.
+
+**Confirmed root fact — REF MISMATCH is real (I verified, not taken on faith).** Every baseline
+matrix run used a *PR-head* ref; the first M2.3 sweep used each mirror's *default-branch head* — a
+different commit. Independently confirmed via `results.json.ref`:
+| recipe | baseline run/ref/level | sweep ref/level |
+|---|---|---|
+| discourse | 184 / 7ae7b0f76efb / L4 | 7d53d4ec390f / L2 |
+| plausible | 308 / 13458fac56a1 / L4 | da159375d89a / L2 |
+| mattermost-lts | 196 / a333e31a6002 / L4 | 41c9eb8e5f34 / L2 |
+| immich | 307 / 107d7220adce / L4 | 7eb3937a82d0 / L2 |
+| lasuite-drive | 189 / ffa7d585afa2 / L5 | f4135d78201e / L0 |
+So the sweep was NOT apples-to-apples vs the baseline matrix. Reconciliation requires either
+(a) re-run at the baseline ref on new main == baseline level, or (b) A/B same-ref old-vs-new main
+== same level. Status per recipe:
+
+- **immich** — m2b-immich (new main, baseline ref 107d7220adce) = **L4 == baseline L4. CLEAN.**
+- **mattermost-lts** — m2b (new main, a333e31a6002) = **L4 == baseline L4. CLEAN.**
+- **plausible** — m2b (new main, 13458fac56a1) = **L4 == baseline L4. CLEAN.**
+  → these three: restructure proven INNOCENT (baseline ref reproduces baseline level on merged main).
+- **bluesky-pds** — ab-bluesky-pds-oldmain (OLD main, b2d86efba3f1) = L0 == new-main sweep L0 at
+  same ref → restructure-NEUTRAL at the sweep ref. (Baseline is "L4-equiv, pre-results-era", no run
+  id — softer baseline; A/B neutrality is the available evidence.)
+- **discourse — NOT yet clean. OPEN.** Two *distinct* flake modes seen, and the A/B was run at the
+  wrong ref to close the gap:
+  - baseline 184 (OLD main, 7ae7b0f): all pass → L4.
+  - m2b-discourse (NEW main, SAME ref 7ae7b0f): **upgrade FAILED**, HC1 guard fired —
+    "upgrade deployed chaos commit 'eb96de94+U', not intended PR-head '7ae7b0f76efb' — re-checkout
+    to code-under-test failed (HC1)" → L1. ← same-ref old=L4 vs new=L1 discrepancy, UNexplained.
+  - ab-discourse-oldmain (OLD main, 7d53d4ec): **restore FAILED** (ci_marker truncated-dump race)
+    → L2 == new-main sweep L2 at that ref → neutrality proven, but for the RESTORE mode at the
+    DEFAULT-head ref, NOT for the L1/upgrade-HC1 mode at the baseline ref.
+  - Net: the clean A/B (ref 7ae7b0f on OLD main vs NEW main) that would explain L4→L1 was NOT run.
+    The upgrade re-checkout/HC1 path lives in run_recipe_ci.py/lifecycle which the meta-param
+    threading DID touch — so "pre-existing flake" is plausible but UNPROVEN here. To clear: run
+    discourse @7ae7b0f on OLD main (does it deterministically reproduce L4, or also flake to L1?),
+    and/or repeat @7ae7b0f on new main to characterise the HC1 re-checkout as a race. The HC1 guard
+    FIRING (not silently passing the wrong commit) is the safety net working — good — but it means
+    the upgrade did not exercise the PR code, so the run is inconclusive, not a clean baseline match.
+- **lasuite-drive** — fix-forward 1357544 (restore best-effort bucket poll) landed; needs a fresh
+  L5 run at the baseline ref ffa7d585afa2 on merged main to confirm baseline. m2rr/earlier runs
+  predate or used the default head — NOT yet a clean baseline match. OPEN.
+
+**M2 disposition: still OPEN — no PASS.** 3/6 cleanly reconciled (immich/mattermost/plausible);
+bluesky neutral-at-sweep-ref; discourse + lasuite-drive NOT yet closed. I will require, at the M2
+claim: (1) discourse same-ref A/B (or repeat) explaining L4→L1; (2) a clean lasuite-drive L5 at
+baseline ref; (3) my own cold re-parse of every per-recipe level vs baseline; (4) the M2.4
+customization-executed spot-greps; (5) zero leaked apps. Recorded a BUILDER-INBOX heads-up on the
+discourse-HC1 gap so it is addressed in the claim, not glossed as "the restore flake".
diff --git a/machine-docs/BUILDER-INBOX.md b/machine-docs/BUILDER-INBOX.md
new file mode 100644
index 0000000..14db8ee
--- /dev/null
+++ b/machine-docs/BUILDER-INBOX.md
@@ -0,0 +1,26 @@
+# Builder inbox — from Adversary @2026-06-10T23:53Z (M2 proof-run heads-up, non-gate)
+
+I cold-parsed the proof runs on cc-ci (m2b-*, ab-*-oldmain) myself. Good news first:
+immich / mattermost-lts / plausible all reproduce **baseline L4 at the baseline ref on merged
+main** (m2b-*) — restructure proven innocent for those three. bluesky-pds is restructure-neutral
+at the sweep ref (ab-oldmain L0 == sweep L0).
+
+**Two recipes are NOT yet cleanly reconciled — please close before you claim M2:**
+
+1. **discourse — the L4→L1 same-ref discrepancy is the gap.** Your A/B (ab-discourse-oldmain)
+   ran ref **7d53d4ec** (default head) → L2 old == L2 new, which proves neutrality for the *restore*
+   race at THAT ref. But m2b-discourse ran the *baseline* ref **7ae7b0f** on new main and got **L1
+   via an UPGRADE HC1 failure** ("deployed chaos commit 'eb96de94+U', not PR-head 7ae7b0f — re-checkout
+   failed"), whereas baseline 184 at that SAME ref was L4. That's a different stage/mode than the
+   restore race, and there is no same-ref A/B for it. The upgrade re-checkout path is in
+   run_recipe_ci.py/lifecycle, which your meta-param threading touched — so I can't accept
+   "pre-existing flake" on faith here. Please run discourse @**7ae7b0f76efb** on OLD main (pre-merge
+   commit) — if it deterministically gives L4, that's a new-main regression to root-cause; if it
+   also flakes to L1, that characterises the HC1 re-checkout as a race. A couple repeats @7ae7b0f on
+   new main would also help. I'll cold re-verify whatever you produce.
+
+2. **lasuite-drive** — your fix-forward 1357544 landed AFTER the m2rr/sweep runs. I need a fresh
+   **L5 run at the baseline ref ffa7d585afa2** on merged main (post-1357544) to confirm baseline.
+
+Not blocking your other M2.4 spot-grep work — just don't let discourse get folded into "the restore
+flake cluster" in the claim; it's now also an upgrade-recheckout mode at the baseline ref.
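
Reviewer-side sketch (not part of the patch above): the reconciliation rule in the review notes, (a) baseline ref on new main reproduces the baseline level, or (b) same-ref old-vs-new main give the same level, can be written down as a small check. This is only an illustration under stated assumptions: the `Run` record, the `"old"`/`"new"` labels, the integer encoding of levels, and the function names are all hypothetical, and baseline runs are assumed to have been taken on old main. Only the refs and levels are taken from the notes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Run:
    """One proof run (hypothetical record, not the real results.json schema)."""
    main: str   # "old" or "new": which CI main the run executed on
    ref: str    # recipe commit under test, possibly abbreviated
    level: int  # 4 stands for "L4"


def same_ref(a: str, b: str) -> bool:
    """Abbreviated and full hashes match when one is a prefix of the other."""
    return a.startswith(b) or b.startswith(a)


def reconcile(baseline: Optional[Run], runs: list[Run]) -> str:
    """Classify a recipe as 'clean', 'neutral' or 'open'."""
    olds = [r for r in runs if r.main == "old"] + ([baseline] if baseline else [])
    news = [r for r in runs if r.main == "new"]
    pairs = [(o, n) for o in olds for n in news if same_ref(o.ref, n.ref)]
    # Any same-ref old/new disagreement is an unexplained discrepancy.
    if any(o.level != n.level for o, n in pairs):
        return "open"
    # (a) the baseline ref on new main reproduced the baseline level.
    if baseline is not None and any(o is baseline for o, _ in pairs):
        return "clean"
    # (b) some same-ref A/B pair agrees, but not at a baseline run.
    if pairs:
        return "neutral"
    return "open"


R = Run
cases = {
    "immich": (R("old", "107d7220adce", 4),
               [R("new", "107d7220adce", 4), R("new", "7eb3937a82d0", 2)]),
    "discourse": (R("old", "7ae7b0f76efb", 4),
                  [R("new", "7ae7b0f", 1), R("old", "7d53d4ec", 2),
                   R("new", "7d53d4ec390f", 2)]),
    "bluesky-pds": (None,
                    [R("old", "b2d86efba3f1", 0), R("new", "b2d86efba3f1", 0)]),
    "lasuite-drive": (R("old", "ffa7d585afa2", 5),
                      [R("new", "f4135d78201e", 0)]),
}
status = {name: reconcile(base, runs) for name, (base, runs) in cases.items()}
print(status)
```

Checking for same-ref disagreement first is what keeps discourse open: its A/B at 7d53d4ec agrees, but the baseline-ref pair (L4 on old main, L1 on new main) does not, so the agreeing pair cannot close it.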