review(rcust): M2 proof-run cold analysis — 3/6 (immich/mattermost/plausible) reproduce baseline L4 at baseline ref on merged main (restructure innocent); discourse L4->L1 upgrade-HC1 at baseline ref UNexplained (A/B was at wrong ref) + lasuite-drive needs fresh L5 post-fix-forward; M2 OPEN
All checks were successful
continuous-integration/drone/push Build is passing
@ -307,3 +307,59 @@ ports), filtering for non-mechanical error-handling (raise/assert/except/exit/ti

Net: exactly ONE accidental hook-port regression (lasuite-drive), now under approved fix. No other
best-effort↔fatal flips. This audit closes the M1-method gap for the hook bodies.

---

### M2 proof-run independent analysis (cold, Adversary) @2026-06-10T23:53Z

M2 is NOT yet claimed by the Builder; this is my independent read of the proof runs sitting on
cc-ci (`/var/lib/cc-ci-runs/{m2b-*,ab-*-oldmain}`), parsed myself via jq (NOT trusting Builder
narrative). The 6 first-sweep mismatches break down as follows.

**Confirmed root fact: REF MISMATCH is real (I verified, not taken on faith).** Every baseline
matrix run used a *PR-head* ref; the first M2.3 sweep used each mirror's *default-branch head*, a
different commit. Independently confirmed via `results.json.ref`:

| recipe | baseline run | baseline ref | baseline level | sweep ref | sweep level |
|---|---|---|---|---|---|
| discourse | 184 | 7ae7b0f76efb | L4 | 7d53d4ec390f | L2 |
| plausible | 308 | 13458fac56a1 | L4 | da159375d89a | L2 |
| mattermost-lts | 196 | a333e31a6002 | L4 | 41c9eb8e5f34 | L2 |
| immich | 307 | 107d7220adce | L4 | 7eb3937a82d0 | L2 |
| lasuite-drive | 189 | ffa7d585afa2 | L5 | f4135d78201e | L0 |
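
A minimal Python sketch of that ref check, assuming each run directory holds a `results.json` with a top-level `ref` key (as in `results.json.ref` above). The `level` key, the helper names `load_ref_level` and `ref_mismatches`, and the hard-coded dictionaries are illustrative assumptions, not cc-ci code; the ref and level values are copied from the table.

```python
import json
from pathlib import Path


def load_ref_level(run_dir):
    """Read (ref, level) from <run_dir>/results.json.

    Assumption: top-level "ref" and "level" keys. Only "ref" is attested
    in the analysis above; "level" is a hypothetical field name.
    """
    data = json.loads((Path(run_dir) / "results.json").read_text())
    return data["ref"], data.get("level")


def ref_mismatches(baseline, sweep):
    """Return {recipe: (baseline_ref, sweep_ref)} for recipes whose refs differ."""
    return {
        recipe: (baseline[recipe][0], sweep[recipe][0])
        for recipe in baseline
        if recipe in sweep and baseline[recipe][0] != sweep[recipe][0]
    }


# (ref, level) per recipe, copied from the table above.
BASELINE = {
    "discourse": ("7ae7b0f76efb", "L4"),
    "plausible": ("13458fac56a1", "L4"),
    "mattermost-lts": ("a333e31a6002", "L4"),
    "immich": ("107d7220adce", "L4"),
    "lasuite-drive": ("ffa7d585afa2", "L5"),
}
SWEEP = {
    "discourse": ("7d53d4ec390f", "L2"),
    "plausible": ("da159375d89a", "L2"),
    "mattermost-lts": ("41c9eb8e5f34", "L2"),
    "immich": ("7eb3937a82d0", "L2"),
    "lasuite-drive": ("f4135d78201e", "L0"),
}

if __name__ == "__main__":
    mismatches = ref_mismatches(BASELINE, SWEEP)
    for recipe, (base_ref, sweep_ref) in sorted(mismatches.items()):
        print(f"{recipe}: baseline {base_ref} != sweep {sweep_ref}")
```

In practice the two dictionaries would be filled by calling `load_ref_level` on each run directory rather than typed in by hand.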

So the sweep was NOT apples-to-apples vs the baseline matrix. Reconciliation requires either
(a) a re-run at the baseline ref on new main that reproduces the baseline level, or (b) a same-ref
A/B of old vs new main that yields the same level on both. Status per recipe:

- **immich**: m2b-immich (new main, baseline ref 107d7220adce) = **L4 == baseline L4. CLEAN.**
- **mattermost-lts**: m2b (new main, a333e31a6002) = **L4 == baseline L4. CLEAN.**
- **plausible**: m2b (new main, 13458fac56a1) = **L4 == baseline L4. CLEAN.**
  → these three: restructure proven INNOCENT (baseline ref reproduces baseline level on merged main).
- **bluesky-pds**: ab-bluesky-pds-oldmain (OLD main, b2d86efba3f1) = L0 == new-main sweep L0 at
  same ref → restructure-NEUTRAL at the sweep ref. (Baseline is "L4-equiv, pre-results-era", no run
  id, so a softer baseline; A/B neutrality is the available evidence.)
- **discourse: NOT yet clean. OPEN.** Two *distinct* flake modes seen, and the A/B was run at the
  wrong ref to close the gap:
  - baseline 184 (OLD main, 7ae7b0f): all pass → L4.
  - m2b-discourse (NEW main, SAME ref 7ae7b0f): **upgrade FAILED**, HC1 guard fired:
"upgrade deployed chaos commit 'eb96de94+U', not intended PR-head '7ae7b0f76efb' — re-checkout
    to code-under-test failed (HC1)" → L1. ← same-ref old=L4 vs new=L1 discrepancy, UNexplained.
  - ab-discourse-oldmain (OLD main, 7d53d4ec): **restore FAILED** (ci_marker truncated-dump race)
    → L2 == new-main sweep L2 at that ref → neutrality proven, but for the RESTORE mode at the
    DEFAULT-head ref, NOT for the L1/upgrade-HC1 mode at the baseline ref.
  - Net: the clean A/B (ref 7ae7b0f on OLD main vs NEW main) that would explain L4→L1 was NOT run.
    The upgrade re-checkout/HC1 path lives in run_recipe_ci.py/lifecycle, which the meta-param
    threading DID touch, so "pre-existing flake" is plausible but UNPROVEN here. To clear: run
    discourse @7ae7b0f on OLD main (does it deterministically reproduce L4, or also flake to L1?),
    and/or repeat @7ae7b0f on new main to characterise the HC1 re-checkout as a race. The HC1 guard
    FIRING (not silently passing the wrong commit) is the safety net working, which is good, but it
    means the upgrade did not exercise the PR code, so the run is inconclusive, not a clean baseline
    match.
- **lasuite-drive**: fix-forward 1357544 (restore best-effort bucket poll) landed; needs a fresh
  L5 run at the baseline ref ffa7d585afa2 on merged main to confirm baseline. m2rr/earlier runs
  predate the fix-forward or used the default head, so NOT yet a clean baseline match. OPEN.
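
The reconciliation rule and the per-recipe statuses above can be sketched as a small decision function. This is illustrative only: `reconcile`, its parameters, and the status strings are hypothetical names, not part of cc-ci. It encodes one assumption drawn from the discourse case: a re-run at the baseline ref that misses the baseline level stays open even when an A/B at a different ref is neutral.

```python
def reconcile(baseline_level, rerun_level=None, ab_levels=None):
    """Classify one recipe as "clean", "neutral", or "open".

    baseline_level: level of the baseline matrix run (None if no usable baseline).
    rerun_level:    level of a new-main re-run at the baseline ref (None if not run).
    ab_levels:      (old_main_level, new_main_level) from a same-ref A/B, or None.
    """
    if rerun_level is not None:
        # Route (a): the baseline ref reproduces the baseline level on new main.
        if rerun_level == baseline_level:
            return "clean"
        # A same-ref discrepancy is not explained by an A/B taken at another ref.
        return "open"
    if ab_levels is not None and ab_levels[0] == ab_levels[1]:
        # Route (b): old and new main agree at the same ref.
        return "neutral"
    return "open"


# Levels as reported in the per-recipe status list above.
STATUS = {
    "immich": reconcile("L4", rerun_level="L4"),
    "mattermost-lts": reconcile("L4", rerun_level="L4"),
    "plausible": reconcile("L4", rerun_level="L4"),
    "bluesky-pds": reconcile(None, ab_levels=("L0", "L0")),
    "discourse": reconcile("L4", rerun_level="L1", ab_levels=("L2", "L2")),
    "lasuite-drive": reconcile("L5"),
}
```

Under these assumptions `STATUS` marks three recipes clean, bluesky-pds neutral, and discourse and lasuite-drive open.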

**M2 disposition: still OPEN, no PASS.** 3/6 cleanly reconciled (immich/mattermost-lts/plausible);
bluesky-pds neutral-at-sweep-ref; discourse + lasuite-drive NOT yet closed. I will require, at the
M2 claim: (1) discourse same-ref A/B (or repeat) explaining L4→L1; (2) a clean lasuite-drive L5 at
baseline ref; (3) my own cold re-parse of every per-recipe level vs baseline; (4) the M2.4
customization-executed spot-greps; (5) zero leaked apps. Recorded a BUILDER-INBOX heads-up on the
discourse-HC1 gap so it is addressed in the claim, not glossed as "the restore flake".

26
machine-docs/BUILDER-INBOX.md
Normal file
@ -0,0 +1,26 @@
# Builder inbox — from Adversary @2026-06-10T23:53Z (M2 proof-run heads-up, non-gate)

I cold-parsed the proof runs on cc-ci (m2b-*, ab-*-oldmain) myself. Good news first:
immich / mattermost-lts / plausible all reproduce **baseline L4 at the baseline ref on merged
main** (m2b-*), so the restructure is proven innocent for those three. bluesky-pds is
restructure-neutral at the sweep ref (ab-oldmain L0 == sweep L0).

**Two recipes are NOT yet cleanly reconciled. Please close them before you claim M2:**

1. **discourse: the L4→L1 same-ref discrepancy is the gap.** Your A/B (ab-discourse-oldmain)
   ran ref **7d53d4ec** (default head) → L2 old == L2 new, which proves neutrality for the *restore*
   race at THAT ref. But m2b-discourse ran the *baseline* ref **7ae7b0f** on new main and got **L1
via an UPGRADE HC1 failure** ("deployed chaos commit 'eb96de94+U', not PR-head 7ae7b0f — re-checkout
   failed"), whereas baseline 184 at that SAME ref was L4. That's a different stage/mode than the
   restore race, and there is no same-ref A/B for it. The upgrade re-checkout path is in
   run_recipe_ci.py/lifecycle, which your meta-param threading touched, so I can't accept
   "pre-existing flake" on faith here. Please run discourse @**7ae7b0f76efb** on OLD main (pre-merge
   commit): if it deterministically gives L4, that's a new-main regression to root-cause; if it
   also flakes to L1, that characterises the HC1 re-checkout as a race. A couple of repeats
   @7ae7b0f on new main would also help. I'll cold re-verify whatever you produce.

2. **lasuite-drive**: your fix-forward 1357544 landed AFTER the m2rr/sweep runs. I need a fresh
   **L5 run at the baseline ref ffa7d585afa2** on merged main (post-1357544) to confirm baseline.

Not blocking your other M2.4 spot-grep work; just don't let discourse get folded into "the restore
flake cluster" in the claim: it's now also an upgrade-recheckout mode at the baseline ref.