REVIEW-canon — Adversary verdicts for the canon (canonical-sweep) phase

SSOT for what is being verified: /srv/cc-ci/cc-ci-plan/plan-phase-canon-canonical-sweep.md. Gates: M1 (machinery works locally, each piece proven) and M2 (proven end-to-end in real CI), plus the operator-required samever-orthogonality proof. ## DONE only after fresh PASS on both.


Orientation @ 2026-06-17T06:18Z — Adversary online for canon phase; no gate claimed yet

Prior phase samever is DONE + Adversary-verified (M1 1310a95, M2 199f5b6, no VETO). The canon phase has not been bootstrapped by the Builder yet: no STATUS-canon.md / BACKLOG-canon.md, no claim(/status(canon commits, no inbox. I am idling per liveness protocol and will verify promptly when M1 is CLAIMED (watchdog will ping on the claim).

Independent COLD baseline of the claimed starting state (§1) — captured before any canon work

Verified from my own clone + a cold ssh cc-ci, NOT from the Builder:

  • Enrollment: exactly one recipe sets WARM_CANONICAL = True: custom-html. (grep -rl 'WARM_CANONICAL *= *True' tests/*/recipe_meta.py → 1 hit.) Matches §1 "only custom-html enrolled".
  • canonical.json records on cc-ci: exactly one, for custom-html: /var/lib/ci-warm/custom-html/canonical.json = {recipe: custom-html, version: 1.13.0+1.31.1, commit: 2b82ebabde74a9d9b1fd4cb49722a7037b18a176, status: idle, ts: 20260617T050314Z}, retained volume warm-custom-html_..._content present.
    • NOTE — plan §1 is now slightly stale. The plan (authored 04:43Z) says "ZERO canonical.json records exist." That was true at authoring, but the just-completed samever M2 e2e (custom-html two-run) wrote this record at 05:03:14Z. So there is now exactly one canonical, produced by samever's promote path. This is favorable evidence for canon M1(A) — the promote path already demonstrably writes a real, reusable record + retains the volume for custom-html — but the Builder must NOT cite custom-html's pre-existing canonical as proof of canon's new work (tagged-gate, trigger, all-enrolled, mirror-sync). I will require fresh, canon-attributable evidence for each M1/M2 sub-claim.
  • Timer: nightly-sweep.timer enabled+active, daily OnCalendar (NEXT 2026-06-18 03:00:24 UTC), last fired 2026-06-17 03:09:20 UTC exit 0. So the timer plumbing works; the job was a near-no-op (only custom-html enrolled). Phase must (F) move this to weekly and (M2) prove a real fire advances canonicals, not exit-0 on an empty set.

What I will adversarially probe when claimed (from the plan, not the Builder's narrative)

  • M1(A): a canon-attributable green cold run writes canonical.json AND --quick warm-reattach reuses it; promote now ALSO requires a release tag — feed an UNTAGGED state, confirm NO promote.
  • M1(C): mirror-sync is faithful upstream sync only — never pushes our changes to mirror main, never disturbs unrelated PRs. Will diff before/after on a mirror.
  • M1(D): trigger keyed on latest release tag vs canonical version, NOT commit — new untagged commits on main with same tag ⇒ SKIP; newer tag ⇒ run cold on that tag.
  • M1(B): all ~21 recipes enrolled; warm-volume disk budget recorded (not silently dropped).
  • M2: full sweep promotes greens / leaves reds intact / skips unchanged; run-twice ⇒ skip-all determinism; real (non-hollow) timer fire; tagged-promote proof (untagged green ⇒ no promote).
  • samever orthogonality: (a) no-new-tag ⇒ SKIPPED; (b) new-tag ⇒ canonical(older)→new, real delta, promote; step-back NEVER fires in the sweep. Construct scenarios if the live set doesn't cover both.
  • §2.G: if plausible's canonical lands at 3.0.1, UPGRADE_BASE_VERSION retired cleanly (key + resolver branch + docs + tests) AND plausible still resolves base 3.0.1 dynamically + passes — else kept with a recorded DECISIONS reason. Will re-derive, not trust.
  • Guardrail: NO AI at runtime (pure script + timer).

Pre-claim code read @ 2026-06-17T06:41Z — M1 still IN PROGRESS (M1.2 not yet committed)

Builder has landed 4 of 5 M1 items (27e0628 M1.1, 136100f M1.3, f8c0e53 M1.4+M1.5). M1.2 (the release-tag trigger sweep_decision + mirror-sync wiring into nightly_sweep.sweep()) is not yet committed — M1 is correctly not-yet-claimed. Read the landed code (NOT JOURNAL); points to scrutinize when claimed:

  • M1.1 (27e0628): should_promote_canonical gained tagged param; caller computes tagged = warm_reconcile.is_released_version(recipe, head_version). ⚠️ PROBE: the gate checks head_version (code under test) but promote_canonical records latest_version(recipe_tags(recipe)) (newest tag). Confirm these can't diverge — e.g. a manual latest run where main sits on a tagged commit OLDER than latest tag would gate on the older tag yet promote the newer. In the sweep path (D) the tag is checked out so head==tag; verify the manual/RECIPE=<r> path too.
  • M1.4 (f8c0e53): root cause = sweep service ran the nix-STORE runner copy (no tests/) so TESTS_DIR missing → enrolled_recipes()=[]. Fix sets CCCI_REPO=/etc/cc-ci + cd + execs $CCCI_REPO/runner/nightly_sweep.py. ⚠️ PROBE at M2: confirm /etc/cc-ci actually exists on cc-ci, has runner/ AND tests/, and is git-pulled before nixos-rebuild (else still hollow). The fix also means sweep-logic ships via checkout pull, NOT a store rebuild — verify deploy procedure pulls it.
  • M1.5 (f8c0e53): OnCalendar daily → Sun *-*-* 03:00:00, Persistent kept. Trivial; verify the deployed timer shows the weekly schedule after M2.1 nixos-rebuild.
  • M1.3 (136100f): enroll all 21 — verify the count is exactly the used-recipes.md set and that fixtures (custom-html-*-bad, concurrency, regression) were NOT enrolled.
  • Still owed for M1 claim: M1.2 sweep_decision(recipe, latest_tag, canon_version) → run|skip:no-new-version|skip:never-released keyed on version_key NOT commit; mirror-sync via open-recipe-pr.sh --reconcile-only (faithful, vendored); cold-run ON THE TAG. Unit tests for all.

M1: PASS @ 2026-06-17T07:12Z — machinery cold-verified (claim 626badd, code @ d4cc9e4)

Verified from a COLD start: my own clone for code/pure-logic, a fresh independent clone on cc-ci (/tmp/adv-canon @ 626badd) for the unit suite, and a cold ssh cc-ci for live state. I did NOT read JOURNAL-canon.md before forming this verdict. Every M1 sub-claim re-derived against the plan, not the Builder's narrative.

M1.1 tagged-promote gate (§2.A) — PASS.

  • Code: should_promote_canonical returns is_enrolled and overall==0 and not quick and not ref and tagged; caller computes tagged = is_released_version(recipe, head_version); promote_canonical now records the TESTED head_version (commit d4cc9e4), not a re-derived latest_version. My prior PROBE (head_version-vs-latest_version divergence on a manual RECIPE=<r> run) is CLOSED by d4cc9e4 — read the diff, it promotes exactly the tested version.
  • Unit: ran tests/unit/test_promote.py myself in the fresh cc-ci clone — all 6 pass, each gate clause individually exercised (test_no_promote_when_untagged asserts tagged=False → False; all-conditions asserts tagged=True → True). Not hollow.
  • Live PROMOTE: re-derived git rev-list -n1 1.13.0+1.31.1 = df2e27339f983a25da548fc8b8d56e9af8645f83 and /var/lib/ci-warm/custom-html/canonical.json records EXACTLY that commit + version 1.13.0+1.31.1, status idle, retained volume warm-custom-html_..._content present. So the promote recorded the tag's own commit (correcting samever's earlier 2b82eba merge-commit record) — the divergence fix is live-proven, not just unit-tested.
  • Live UNTAGGED → NO PROMOTE: independently confirmed 1.13.1+1.31.1 is NOT-A-TAG in the custom-html clone → is_released_version returns False → gate blocks. canonical.json is unchanged (still df2e273). The full live tagged-vs-untagged e2e is M2.4; at M1 the code + unit + live-not-a-tag + unchanged-canonical chain is sufficient.
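
The clause-by-clause check I ran against the gate can be sketched as follows. This is a minimal illustration, not the repo's code: only the boolean expression is taken from the reviewed source, while the parameter list, the `BASELINE`/`FLIPS` tables and `blocked_by_each_clause` are hypothetical names introduced here.

```python
from typing import Dict, Optional


def should_promote_canonical(is_enrolled: bool, overall: int, quick: bool,
                             ref: Optional[str], tagged: bool) -> bool:
    """Gate expression as quoted in this review; the parameter list is an assumption."""
    return bool(is_enrolled and overall == 0 and not quick and not ref and tagged)


# All-true baseline: enrolled, green, cold (not quick), no REF, released tag.
BASELINE = dict(is_enrolled=True, overall=0, quick=False, ref=None, tagged=True)

# One flipped clause per entry; every flip alone must block the promote.
FLIPS = {
    "not enrolled": dict(is_enrolled=False),
    "red run": dict(overall=1),
    "quick (warm) run": dict(quick=True),
    "explicit ref": dict(ref="some-branch"),
    "untagged head": dict(tagged=False),
}


def blocked_by_each_clause() -> Dict[str, bool]:
    """Map each flipped clause to True when the gate refuses to promote."""
    return {name: not should_promote_canonical(**{**BASELINE, **flip})
            for name, flip in FLIPS.items()}


if __name__ == "__main__":
    assert should_promote_canonical(**BASELINE)
    for name, blocked in blocked_by_each_clause().items():
        print(f"{name}: {'blocked' if blocked else 'PROMOTED (gate is hollow)'}")
```

A gate that still promoted under any single flip would be hollow for that clause; "untagged head" is the clause this phase adds.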

M1.2 release-tag trigger + faithful mirror-sync (§2.C/§2.D) — PASS.

  • sweep_decision re-derived directly (no pytest) — truth table exactly right and VERSION-keyed, not commit-keyed: new>canon→run; equal→skip no-new-version; older→skip; no tag→skip never-released; no canon→run(seed). The function takes only (latest_tag, canon_version) — it CANNOT see commits, so new untagged commits on main can never trigger a run. That IS the operator's refinement.
  • scripts/recipe-mirror-sync.sh read in full: pins an explicit coopcloud upstream remote, force-syncs mirror main := upstream/main + all tags, pushes NOTHING of our own. PR close is gated on git merge-tree --write-tree NEW_MAIN_SHA <pr-head> == upstream MAIN_TREE (i.e. the PR's merge is a no-op because it's already in upstream) → close; otherwise "left as-is". Faithful, never merges, never disturbs unrelated PRs.
  • nightly_sweep.sweep() wiring read: per enrolled recipe mirror_sync → fetch_recipe → sweep_decision → run_on_tag (checkout the release tag + CCCI_SKIP_FETCH=1 so head IS the tag → tagged-gate passes; REF popped → cold → promote allowed). Pure script.
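
The truth table above can be written down as a small sketch. Assumptions, stated plainly: the real `version_key` is not quoted in this review, so the one below (numeric components of the part before `+`) is a hypothetical stand-in, and the label returned for the "older" case is my guess, since the review only records "older→skip".

```python
import re
from typing import Optional, Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Hypothetical ordering key: numeric components of the recipe part (before '+')."""
    recipe_part = version.split("+", 1)[0]
    return tuple(int(n) for n in re.findall(r"\d+", recipe_part))


def sweep_decision(latest_tag: Optional[str], canon_version: Optional[str]) -> str:
    """Decide from versions only; no commit is ever passed in, so none can trigger a run."""
    if not latest_tag:
        return "skip:never-released"   # recipe has no release tag at all
    if not canon_version:
        return "run"                   # no canonical yet: seed it
    if version_key(latest_tag) > version_key(canon_version):
        return "run"                   # newer tagged release than the canonical
    return "skip:no-new-version"       # equal or older (label for "older" is assumed)


if __name__ == "__main__":
    cases = [
        ("3.6.0+1.24.2-rootless", "3.5.3+1.24.2-rootless"),
        ("1.13.0+1.31.1", "1.13.0+1.31.1"),
        ("1.13.0+1.31.1", None),
        (None, None),
    ]
    for latest, canon in cases:
        print(latest, canon, "->", sweep_decision(latest, canon))
```

Because the function's inputs are two version strings, new untagged commits on main leave both arguments unchanged and the decision stays a skip.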

M1.3 all recipes enrolled (§2.B) — PASS. My grep -rl 'WARM_CANONICAL = True' set is EXACTLY the 21 used-recipes.md rows (incl. uptime-kuma, the lone external row — correctly enrolled for CI/canonical even though excluded from weekly upgrade). Fixtures (custom-html-*-bad, concurrency, regression) NOT enrolled.

M1.4 hollow-sweep fix — PASS (code; live is M2.1). nix/modules/nightly-sweep.nix exports CCCI_REPO=/etc/cc-ci, cds there, and execs $CCCI_REPO/runner/nightly_sweep.py — the checkout WITH tests/, replacing the store copy whose missing tests/ caused enrolled_recipes()=[]. Root cause correctly addressed in code. ⚠️ CARRIED TO M2: /etc/cc-ci is currently STALE — git -C /etc/cc-ci HEAD is e60415d (Phase-3 era), canon code NOT yet there. M2.1 deploy MUST git -C /etc/cc-ci pull before nixos-rebuild, else the deployed timer stays hollow. I will verify the pull + a real fire at M2.5.

M1.5 weekly timer (§2.F) — PASS (code). OnCalendar = "Sun *-*-* 03:00:00", Persistent = true. Deployed-timer schedule verified at M2.

Guardrail NO-AI-at-runtime — PASS. grep of nightly_sweep.py / warm_reconcile.py / recipe-mirror-sync.sh for anthropic|claude|openai|llm|gpt|ai_ → only one code COMMENT match, zero calls. Pure script + systemd timer.

Full unit suite — PASS. Ran cc-ci-run -m pytest tests/unit/ in the fresh independent cc-ci clone @ 626badd → 295 passed in 5.60s, matching the claim. Enrolling 21 recipes broke nothing.

Minor narrative note (not a defect): the claim cites proof-A ts 065027Z but live canonical ts is 065532Z; promoting the same tag again yields the same version+commit (only ts moves), so this is a benign re-run, not a divergence — the recorded version/commit are correct either way.

Verdict: M1 PASS. No VETO. All M1 DoD items cold-verified; the deployed-state items (M1.4 live, M1.5 timer schedule) are honestly scoped by the Builder to M2 and I will hold them there. (Consulted JOURNAL-canon.md only AFTER writing this verdict: no surprises — confirms the proof-A/C sequence.)


Pre-claim observation @ 2026-06-17T07:23Z — M2.1 deploy verified live (NOT a gate verdict)

Builder inbox: M1 PASS consumed; M2.1 deploy done; M2.2 full sweep started (long, serial, hours). M2 NOT yet claimed — no formal verdict here, just an opportunistic READ-ONLY check that resolves my two carried-to-M2 code-only probes (favorable; I'll still re-verify the live proofs at the M2 claim):

  • /etc/cc-ci now at 3bdd5d1 (current main; was stale e60415d Phase-3 era), with tests/ + runner/nightly_sweep.py present → the deploy DID git -C /etc/cc-ci pull. My M1.4 "deploy must pull or stays hollow" risk is cleared.
  • Deployed timer: systemctl cat nightly-sweep.timer → OnCalendar=Sun *-*-* 03:00:00, Persistent=true (weekly, live). M1.5 deployed-schedule probe cleared.
  • Deployed code path is the non-hollow one: the in-flight sweep (PID 1620630) runs nightly_sweep.sweep() from /etc/cc-ci/runner, and run_recipe_ci.py runs from /etc/cc-ci/runner/ — i.e. the checkout WITH tests/, not the store copy. Root cause fixed live.

STILL OWED at the M2 claim (I will cold-verify, not trust the sweep log): canonicals actually promoted for greens / reds left intact / no-new-tag skipped (M2.2); run-twice→skip-all (M2.3); live tagged-vs-untagged (M2.4); real timer fire advances canonicals via full main() incl. roll (M2.5); samever never fires in-sweep (M2.6); disk budget recorded (M2.7); §2.G UPGRADE_BASE_VERSION retirement (M2.8). Staying read-only while the sweep is in flight (single node).

Pre-claim finding @ 2026-06-17T08:40Z — M2.2 sweep: PASS-labelled but promotes mostly FAILING (evidence captured)

NOT a verdict (M2 unclaimed). Read-only capture from /root/canon-verify/_sweep.log so the evidence survives log growth. Per-recipe promote outcomes observed (alphabetical sweep, ~7 recipes deep):

  • bluesky-pds: cold rc=0; WC5 promote failed: abra app deploy warm-bluesky-pds… failed (1) → NO canonical; logged PASS (promoted).
  • cryptpad: cold rc=0; canonical cryptpad advanced to known-good 0.6.0+v2026.5.1 → canonical WRITTEN. ✓ (the only real promote so far)
  • custom-html: SKIP no-new-version (pre-existing canonical). ✓ expected.
  • custom-html-tiny: cold rc=0; WC5 promote failed: warm-custom-html-tiny… not healthy over HTTPS / (404) → NO canonical; logged PASS (promoted).
  • discourse: cold rc=142 (deploy timeout — the 51m wedge I flagged) → FAIL (canonical unchanged). Legit red.
  • drone: cold rc=0; WC5 promote failed: …warm-drone… timed out after 600 seconds → NO canonical; logged PASS (promoted).
  • ghost: cold rc=0; WC5 promote failed: abra app new ghost… failed (1) → NO canonical; logged PASS (promoted).
  • gitea: promote in progress at capture. Live /var/lib/ci-warm/*/canonical.json = {cryptpad, custom-html} only. NET NEW this sweep = 1 (cryptpad). Leftover warm volumes w/ NO registry record: drone, gitea, custom-html-tiny (partial-promote residue).

DEFECT-1 [adversary] (results-label): nightly_sweep.sweep() line ~119 sets results[r] = "PASS (promoted)" if rc==0 else "FAIL …". Because promote_canonical is non-fatal (swallows its own exception so it "never fails a green run"), a FAILED promote still yields rc=0 → the summary asserts "PASS (promoted)" when NO canonical was written. The per-recipe results log — the DoD's evidence that "canonicals actually promoted for the green recipes" — is therefore UNTRUSTWORTHY. Repro: grep "WC5 promote failed" _sweep.log vs grep "PASS (promoted)" _sweep.log — failed promotes appear in BOTH. Fix direction: label from "does a canonical record now exist at the tested version", not from rc.
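
One possible shape of that fix direction, as a hedged sketch rather than the Builder's actual patch: derive the label from the registry state after the run instead of from rc. The function name `result_label` and its parameters are hypothetical; the label strings follow the sweep-log wording quoted in this file.

```python
from typing import Optional


def result_label(rc: int, canonical_version: Optional[str], tested_version: str) -> str:
    """Label a sweep result from what the registry records, not from the exit code alone.

    canonical_version is whatever canonical.json holds AFTER the run (None if absent).
    """
    if rc != 0:
        return "FAIL (red; canonical unchanged)"
    if canonical_version == tested_version:
        return f"PASS (promoted {tested_version})"
    # Green cold run, but the registry does not show the tested version:
    # the non-fatal promote failed and must not be reported as promoted.
    return (f"GREEN-BUT-PROMOTE-FAILED "
            f"(canonical={canonical_version or 'none'}, expected {tested_version})")


if __name__ == "__main__":
    print(result_label(0, None, "0.3.0+v0.4.219"))
    print(result_label(0, "1.9.0+2.26.0", "1.9.0+2.26.0"))
    print(result_label(143, None, "0.8.1"))
```

With this shape a swallowed promote exception can no longer surface as "PASS (promoted)", because the pass label requires the record to exist at the tested version.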

DEFECT-2 [adversary] (promote path failing broadly): 4 of 5 completed promotes FAILED across 4 modes (warm app deploy failed(1) / timed-out 600s / unhealthy-404 / app new failed(1)). Cold CI is green for each, so this is specifically the WARM-CANONICAL promote deploy failing — the exact end-to-end step this phase exists to make real. Root cause TBD (node contention on the long serial run / unclean cold-test teardown / discourse residue / flat 600s warm timeout) — Builder's to diagnose.

Determinism risk (M2.3): every recipe left without a canonical (bluesky-pds, custom-html-tiny, drone, ghost, discourse…) will sweep_decision(latest, None) → run on a second sweep, NOT skip — so run-twice ≠ skip-all until promotes actually succeed. I will hard-test this at the M2 claim.

Sent the Builder a BUILDER-INBOX heads-up (ba28a88). When M2 is claimed I will cold-verify, per recipe, that a canonical record exists at the tested tag version (not trust the PASS label), and re-run the determinism no-op myself. If promotes are still failing / mislabelled, M2 FAILs.

Pre-claim note @ 2026-06-17T09:11Z — fix f94de22 validated by Builder; M2 re-run in flight (NOT a verdict)

Consumed ADVERSARY-INBOX (Builder ~09:10Z): DEFECT-1/DEFECT-2 fix validated live — custom-html-tiny PROMOTED (1.2.0+2.43.0, was 404) and ghost PROMOTED (1.4.0+6.45.0-alpine, was app-new dirty-tree FATA); label now derives from "canonical record exists at tested version". 7 canonicals claimed (cryptpad, custom-html, custom-html-tiny, ghost, gitea, hedgedoc, immich). Full sweep re-run in flight. M2 unclaimed. Staying read-only off the node (sweep in flight, single node).

bluesky-pds "documented RED" — must scrutinise at M2 claim, two ways it could be wrong:

  1. The conservative direction is CORRECT per guardrail (no force-promote; prior known-good kept). But I must confirm bluesky has NO stale/partial canonical written, and that it is recorded as an exception in DECISIONS (plan §2.B: "don't silently skip" / §4 "documented exception"), not just left silent.
  2. The real risk: Builder says warm health fails because traefik doesn't route the WARM domain (warm-bluesky-pds… → 000) though internal localhost:3000 = 200, and "cold domain worked." I must verify this is genuinely bluesky-SPECIFIC and not a warm-canonical-deploy machinery defect (warm domain label/overlay/router rule) that could equally hit other recipes — if the warm-domain routing is systemically flaky, a recipe could intermittently fail to promote (or, worse, a health probe could pass spuriously). At claim I will: (a) confirm OTHER promoted recipes (custom-html-tiny, ghost, immich) actually answered 200 over HTTPS on THEIR warm domains during promote (grep ready-probe lines), and (b) independently curl a couple of the live warm canonical domains. If warm-domain routing is broadly unreliable, the promote evidence is suspect and M2 is not done.

Pre-claim observation @ 2026-06-17T09:34Z — read-only sweep-progress peek (NOT a verdict)

Sweep re-run still in flight (proc 1712141 from /etc/cc-ci/runner); 7 canonicals on disk. Captured from _sweep.log so it survives log growth:

  • DEFECT-1 fix is LIVE and honest: sweep: bluesky-pds rc=0 (GREEN-BUT-PROMOTE-FAILED (canonical=none, expected 0.3.0+v0.4.219)) — the label no longer claims PASS (promoted) on a failed promote. Favorable; I will still confirm the label matches the on-disk registry per recipe at claim before closing DEFECT-1.
  • cryptpad / custom-html / custom-html-tiny → SKIP no-new-version (latest tag == canonical). The skip path works for promoted recipes.
  • discourse rc=143 → FAIL (red; canonical unchanged) — legit red (timeout/SIGTERM), canonical kept.
  • NEW — sweep: mirror-sync drone rc=128 (non-fatal — continuing): drone's faithful mirror-sync FAILED (git rc=128) yet the sweep proceeded to RUN drone against the un-synced mirror. SCRUTINISE at claim: plan §2.C requires the mirror be reconciled to upstream FIRST; a swallowed sync failure means the recipe may be tested against a stale mirror (wrong tags/version) — the trigger (D) and tagged promote then rest on un-synced state. Is rc=128 a benign "already up to date / no upstream" case or a real sync failure? Must check what drone's sync hit and whether the tested tag is genuinely upstream's.
  • DETERMINISM (M2.3) — central risk crystallising: bluesky-pds (promote-failed) and discourse (red) both end canonical=none, so a 2nd sweep → sweep_decision(latest, None) → RUN, NOT skip. Plan M2.3 literally requires run-twice → "SKIPS every recipe." That can hold ONLY if every enrolled recipe actually promoted. Red/promote-failed recipes legitimately re-run (no known-good to protect) — which is arguably correct behaviour but is NOT "skip every recipe." At the M2 claim I will require the Builder's determinism evidence to honestly reconcile this with §3/§5: either (i) every recipe promotes so run-twice is a true no-op, or (ii) a reasoned, plan-consistent argument that the no-op property applies to the promoted set and red recipes correctly retry — and I'll judge it against the plan, not accept a partial skip-all relabelled as success.

Pre-claim observation @ 2026-06-17T10:20Z — TWO concurrent sweeps (transient process state, captured)

Read-only ps on cc-ci caught a non-serial condition while M2 is mid-development (NOT a verdict; M2 unclaimed):

  • PID 1712141 = OLD sweep (started 09:10:40, code f94de22) — WEDGED: child PID 1720589 (run_recipe_ci.py, started 09:33:58, alive ~46 min) is the drone cold-dep self-deadlock the lock-release fix (655a999) addresses. The old sweep process is still ALIVE, holding cold-test locks.
  • PID 1736506 = NEW sweep (started 10:16:27, code 655a999), already cold-testing recipe 1. So at 10:20Z two nightly_sweep.sweep() ran simultaneously. This violates §4 SERIAL and, more pointedly, invalidates the documented precondition of release_app_locks() ("serial sweep → no concurrent run relies on these locks") — the wedged old run still holds drone/gitea locks, so the two can collide. Any M2 promote/determinism/log evidence from a sweep that overlapped the wedged one is non-serial and I will not accept it. Canonical count is 8 (drone now promoted → lock-release fix works), so the fix itself is good; the issue is the leftover concurrent process. Sent BUILDER-INBOX asking the Builder to kill the wedged old sweep, confirm a clean single serial run, and regenerate M2 evidence. SCRUTINY CARRIED TO CLAIM: confirm the claimed M2 sweep ran with exactly ONE sweep process and no overlap (check run start time vs old-sweep kill time); and verify release_app_locks() cannot free a lock still guarding a live app under any interleaving the in-flight guard permits.

Update @ 10:24Z: Builder consumed the alert and acted correctly — SIGKILLed both sweeps + the wedged drone child, cleared stale /run/lock/cc-ci-app-*.lock, confirmed no leftover warm-*/dep stacks, discarded drone's concurrency-tainted canonical (promoted by a standalone validation at 10:06:45 that overlapped the wedged old sweep), kept the 7 single-run canonicals, and relaunched ONE clean serial sweep (pid 1741209, code 655a999) as the M2.2 evidence run. Concurrency window was ~10:06→10:24 (old sweep 1712141 alive 09:10→killed 10:24). CARRIED TO CLAIM: independently confirm each of the 7 kept canonicals (cryptpad, custom-html, custom-html-tiny, ghost, gitea, hedgedoc, immich) has a ts OUTSIDE the concurrency window and was produced single-run — do NOT take the Builder's accounting on faith; check canonical.json ts per recipe vs the 09:10→10:24 overlap. And confirm the claimed sweep (1741209) ran start→finish with no second sweep process alive.
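
The per-recipe ts check carried to the claim can be sketched like this. It assumes canonical.json's ts always uses the compact UTC form seen in this log (e.g. 20260617T050314Z); the helper names are hypothetical, and the sample records are illustrative values assembled from timestamps mentioned in this file, not a dump of the live registry.

```python
from datetime import datetime, timezone
from typing import Dict, List

TS_FORMAT = "%Y%m%dT%H%M%SZ"  # compact UTC stamp as seen in canonical.json


def parse_ts(ts: str) -> datetime:
    """Parse a canonical.json ts into an aware UTC datetime."""
    return datetime.strptime(ts, TS_FORMAT).replace(tzinfo=timezone.utc)


def tainted_recipes(records: Dict[str, str], start: datetime, end: datetime) -> List[str]:
    """Return recipes whose canonical ts falls inside [start, end] (inclusive)."""
    return sorted(r for r, ts in records.items() if start <= parse_ts(ts) <= end)


# Conservative overlap window: old sweep alive 09:10, killed 10:24.
WINDOW_START = parse_ts("20260617T091000Z")
WINDOW_END = parse_ts("20260617T102400Z")

if __name__ == "__main__":
    sample = {
        "custom-html": "20260617T065532Z",  # before the window
        "gitea": "20260617T083930Z",        # before the window
        "drone": "20260617T100645Z",        # assumed ts of the discarded tainted promote
    }
    print(tainted_recipes(sample, WINDOW_START, WINDOW_END))
```

Any recipe this returns for the kept set would mean its canonical was written while two sweeps could have overlapped, and I would not accept it as single-run evidence.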

Pre-claim observation @ 2026-06-17T10:47Z — clean serial sweep progress (NOT a verdict)

ONE sweep proc confirmed (serial intact). Transient _sweep.log lines captured before rotation:

  • CONCERN — drone rc=0 GREEN-BUT-PROMOTE-FAILED (canonical=none, expected 1.9.0+2.26.0) in the CLEAN serial run. Drone promoted under the discarded tainted validation but FAILS to promote clean-serial — and it no longer hangs (returns cleanly), so the lock-release fix (655a999) cured the 46-min deadlock but drone's warm promote still fails for a DIFFERENT reason (likely warm gitea-dep provisioning or warm deploy/health). Net: the lock fix is necessary-but-not-sufficient for drone; drone will lack a canonical → hits both promote-evidence and determinism (run-twice) at the claim. Builder will see it in their own running log; their diagnose. I'll require drone to either promote clean or be a recorded DECISIONS exception (like bluesky) at claim — a silent no-canonical is not OK.
  • FAVORABLE — gitea RUN — new release 3.6.0+1.24.2-rootless > canonical 3.5.3+1.24.2-rootless; cold-testing tagged release 3.6.0… — a LIVE instance of the new-release-tag trigger advancing an existing canonical (older→newer TAGGED), i.e. exactly the M2.6 samever-orthogonality path (2): canonical(older)→new tagged, real delta, promote-if-green. If gitea promotes to 3.6.0 this is strong M2.6 evidence (no constructed scenario needed). VERIFY AT CLAIM: gitea's canonical advances 3.5.3→3.6.0 with the new tag's own commit, and samever's same-version step-back NEVER fired in the run (the tag trigger guarantees vX→vY, Y>X, so no vX→vX). Watch that gitea actually promotes (not GREEN-BUT-FAILED).
  • SKIPs (cryptpad/custom-html/custom-html-tiny/ghost = no-new-version) and discourse rc=143 red: consistent with prior runs.

Pre-claim note @ 2026-06-17T10:59Z — two more Builder fixes; M2-evidence-sweep recency criterion

Builder landed ca89d44 (promote clears stale warm-stack on FRESH SEED only — fixes the failed-promote secret residue, e.g. drone's gitea client_secret_v1 blocking abra app secret insert on retry; correctly does NOT teardown when a canonical exists → retained volume safe) and d072d7e (de-enroll keycloak — structural collision with the live-warm OIDC provider on warm-keycloak.ci...; thorough DECISIONS entry; enrolled now 20 + 1 documented exception). Both reasonable. The residue fix is the likely root cause of the clean-serial drone promote-fail I flagged. M2-EVIDENCE RECENCY CRITERION (new, checkable): the in-flight sweep pid 1741209 launched ~10:16 — BEFORE ca89d44 (10:51) and d072d7e (10:54) — so its parent-process enrolled set still includes keycloak and its sweep logic predates the residue fix (only per-recipe run_recipe_ci.py picks up new code if /etc/cc-ci is pulled mid-run; nightly_sweep.sweep()'s enrolled list + decisioning is fixed at launch). Therefore the authoritative M2.2 sweep I accept MUST be one launched with /etc/cc-ci at a HEAD that contains BOTH fixes, enrolled=20 (keycloak absent), single serial proc. At claim: check the evidence sweep's launch time vs these commit times, and confirm drone now PROMOTES (residue fix) or is a recorded exception. Also verify ca89d44's fresh-seed teardown can't nuke a shared/retained volume (guarded by if not read_registry(recipe) — only when no canonical exists, so nothing known-good to lose; confirm).

Pre-claim verification @ 2026-06-17T11:12Z — fresh-seed-teardown × live-keycloak footgun: MITIGATED

Identified a real footgun in ca89d44: the fresh-seed branch does teardown_app(canonical_domain(recipe)) for any enrolled recipe lacking a canonical. For keycloak, canonical_domain == the LIVE shared OIDC provider domain warm-keycloak.ci... — so a fresh-seed keycloak promote would have TORN DOWN the live provider that lasuite-*/drone depend on. The de-enroll (d072d7e) is precisely what prevents this. INDEPENDENTLY VERIFIED (read-only, my own checks, not Builder's word):

  • At HEAD: tests/keycloak/recipe_meta.py → WARM_CANONICAL = False; canonical.enrolled_recipes() = 20, keycloak NOT in set → the post-fix sweep never runs the fresh-seed teardown against keycloak.
  • Live https://warm-keycloak.ci.commoninternet.net/realms/master → 200; services warm-keycloak_..._app + _db both 1/1 → the pre-fix sweep 1741209's keycloak promote attempt (old promote, no teardown) did NOT disrupt the live provider. Healthy.

Conclusion: footgun is structurally mitigated AND live-confirmed unharmed — favorable. STILL CARRY TO CLAIM: confirm NO OTHER enrolled recipe's canonical_domain collides with a live/shared service (so the fresh-seed teardown only ever hits a disposable warm-* stack), and that the final sweep's keycloak absence holds at the sweep's launch HEAD.

Pre-claim observation @ 2026-06-17T11:23Z — pre-fix sweep FINISHED (0 procs); 15 canonicals

Final tail of the pre-fix serial sweep (1741209): n8n PASS(3.4.0+2.23.2), plausible PASS(3.1.0+v2.0.0), uptime-kuma PASS(3.1.0+2.4.0); mumble rc=1 FAIL (red; canonical unchanged). Canonical count = 15. Two new claim-scrutiny points:

  • mumble — NEW red (rc=1, not a timeout), not previously documented. Before M2 it must be either fixed (promotes clean) or recorded as a DECISIONS exception with a reason — a silent no-canonical is not acceptable (same bar I'm holding bluesky/discourse/drone to). Watch for the diagnosis.
  • plausible promoted at 3.1.0+v2.0.0, NOT the 3.0.1 the plan §2.G anticipated. The §2.G UPGRADE_BASE_VERSION retirement reasoning ("canonical at 3.0.1 → dynamic base resolves 3.0.1 → pin redundant, drop the broken 3.0.0") must be RE-DERIVED against the actual canonical 3.1.0+v2.0.0: at claim verify that with plausible's real canonical, the dynamic upgrade base resolves to a correct green release (NOT the broken 3.0.0 clickhouse-404 base) and plausible's upgrade tier passes — only then is dropping the pin safe. If not, the pin stays with a recorded reason (§2.G GATE).

Builder's plan next: deploy fixes to /etc/cc-ci, re-promote drone (fresh-seed fix) + retry gitea 3.6.0, then launch the FINAL authoritative sweep = the M2.2 evidence (postdates ca89d44+d072d7e, enrolled=20).

Pre-claim @ 2026-06-17T11:35Z — FINAL authoritative sweep launched; recency criterion MET (confirmed)

Builder launched the authoritative M2.2 sweep (pid 1960362, ~11:26Z) from /etc/cc-ci @ 12acf94. I INDEPENDENTLY confirmed git merge-base --is-ancestor: ca89d44 (residue) AND d072d7e (keycloak) are both ancestors of 12acf94 → the evidence sweep postdates both fixes, enrolled=20, single serial. My M2-evidence recency criterion is satisfied — this run is the legitimate M2.2 evidence. (Still verify at claim: it ran start→finish with no second sweep proc.)
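
The recency check above can be scripted; the sketch below is one way to do it, not the command history of this review. It relies on `git merge-base --is-ancestor`, which exits 0 when the first commit is an ancestor of the second and 1 when it is not. The function names and the injectable `run` parameter are hypothetical conveniences.

```python
import subprocess
from typing import Callable, Iterable, List


def is_ancestor(repo: str, ancestor: str, head: str,
                run: Callable = subprocess.run) -> bool:
    """True if `ancestor` is reachable from `head` in the checkout at `repo`."""
    proc = run(["git", "-C", repo, "merge-base", "--is-ancestor", ancestor, head],
               check=False)
    if proc.returncode in (0, 1):      # 0 = ancestor, 1 = not an ancestor
        return proc.returncode == 0
    # Any other status is a git error (e.g. unknown commit), not a "no".
    raise RuntimeError(f"git merge-base failed with rc={proc.returncode}")


def missing_fixes(repo: str, head: str, fixes: Iterable[str],
                  run: Callable = subprocess.run) -> List[str]:
    """Return the fix commits that the sweep's launch HEAD does NOT contain."""
    return [fix for fix in fixes if not is_ancestor(repo, fix, head, run=run)]


if __name__ == "__main__":
    # Commit ids as recorded in this review; the repo path is the deployed checkout.
    absent = missing_fixes("/etc/cc-ci", "12acf94", ["ca89d44", "d072d7e"])
    print("recency criterion met" if not absent else f"missing: {absent}")
```

Treating non-0/1 exit codes as errors matters here: an unknown commit id must not be silently read as "not an ancestor".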

Red diagnoses to verify at claim (Builder posture = "red test is information, never weakened" — correct):

  • discourse: upstream 0.8.1 compose invalid (sidekiq → undefined service discourse). VERIFY: it's a genuine upstream defect (re-read the compose), not our overlay; canonical unchanged.
  • mattermost-lts: test_restore.py::test_restore_returns_state FAILED at latest. VERIFY: the test is unmodified (git-blame the test vs main; not weakened/xfail'd to dodge), failure is real.
  • mumble: custom/test_protocol_handshake.py::test_handshake_completes_with_channel_presence FAILED. VERIFY: test unmodified, real failure.
  • bluesky-pds: cold green, warm-promote health 000 (traefik doesn't route warm domain; PDS 200 on localhost:3000). VERIFY recipe-specific (not machinery): confirm other promoted recipes DID answer 200 over HTTPS on their warm domains (already favorable — 15 promoted healthy).

ALL FOUR must be recorded as DECISIONS exceptions with reasons (not silent no-canonicals) before M2. Expected from this sweep: ~14 SKIP (determinism), drone PROMOTES (residue fix), gitea 3.5.3→3.6.0 advance.

Pre-claim findings @ 2026-06-17T11:58Z — final sweep crux outcomes (drone ✓, gitea advance ✗)

Cold-read from cc-ci (raw canonical.json, my own check). 16 canonical recipes on disk: cryptpad, custom-html, custom-html-tiny, drone, ghost, gitea, hedgedoc, immich, lasuite-{docs,drive,meet}, mailu, matrix-synapse, n8n, plausible, uptime-kuma. 16 promoted + 4 documented reds (discourse, mattermost-lts, mumble, bluesky-pds) = 20 enrolled. Clean accounting.

  • drone — PROMOTED CLEAN ✓ (favorable, DEFECT-2 closing evidence). /var/lib/ci-warm/drone/canonical.json = {version 1.9.0+2.26.0, commit 91b27ceb…, status idle, ts 20260617T115046Z} — fresh, from THIS final post-fix sweep; log sweep: drone rc=0 (PASS (promoted 1.9.0+2.26.0)). The fresh-seed-teardown residue fix (ca89d44) resolved the once-failed-promote secret residue. (At the formal claim I'll re-derive that commit == the 1.9.0+2.26.0 tag's commit, and confirm warm reattach.)
  • gitea — ADVANCE FAILED AGAIN ✗ (CLAIM-BLOCKER for M2.6 + M2.3). Log: sweep: gitea RUN — new release 3.6.0+1.24.2-rootless > canonical 3.5.3+1.24.2-rootless … rc=0 (GREEN-BUT-PROMOTE-FAILED (canonical=3.5.3…, expected 3.6.0…)). canonical.json still 3.5.3+1.24.2-rootless (ts 083930Z, OLD) — known-good correctly PRESERVED on the failed advance, but the advance did NOT happen. Impact:
    1. M2.6 not demonstrated: gitea was the live new-tag→canonical(older)→new advance proof. The trigger fired (RUN on the newer tag) and old-known-good was kept, but a SUCCESSFUL promote to the new tagged version — which §3/§5 M2.6 requires — did not occur. Needs a real fix or the plan's alternative (construct custom-html older→new).
    2. M2.3 determinism dirtied: on a 2nd sweep sweep_decision(gitea, 3.6.0, 3.5.3) → RUN, so gitea re-runs — and it is NOT a genuine red (cold test is GREEN; only the warm advance promote times out ~600s). So it is NOT covered by "reds correctly retry"; it is a green recipe whose promote deterministically fails, which both wastes a CI rerun AND breaks "run-twice → skip-all". A plain retry won't fix a deterministic timeout — needs the warm-advance timeout raised / the in-place version-bump deploy diagnosed, OR gitea documented like the reds (but it's green, so that's weaker). Sending the Builder a heads-up so they don't claim M2 with this open.

Sweep completion @ 12:00:03Z: authoritative sweep === M2.2 FULL SWEEP done rc=0 2026-06-17T12:00:03Z === (ran 11:25:57→12:00:03, ~34m; node idle after, no sweep/run procs). Determinism preview already visible IN this run: n8n/plausible/uptime-kuma/immich/lasuite-*/mailu/matrix-synapse all SKIP no-new-version = the just-promoted recipes correctly skip. Builder consumed my gitea heads-up (9303359: "gitea 3.6.0 advance — fixing; drone promoted clean"). Awaiting gitea fix + M2.3/M2.5/M2.6/M2.7/M2.8 proofs before any M2 claim.

Pre-claim assessment @ 2026-06-17T12:21Z — gitea-exception diagnosis + M2.3 reframing (my acceptance bar)

Builder landed bdc2ec4 (DECISIONS): gitea 3.6.0 warm-advance documented as a RECIPE issue + an M2.3 determinism reframing. My standard for accepting these at the M2 claim:

gitea 3.6.0 exception — diagnosis plausible; two things I will independently verify (not take on faith):

  • Builder's isolation claim is the right shape: the warm-ADVANCE machinery is proven via a CONSTRUCTED custom-html older→new advance (M2.6), so gitea's failure is gitea-specific not machinery. VERIFY the custom-html advance ACTUALLY promoted (canonical advanced old→new, healthy) — that's load-bearing.
  • The gitea crash is JWT Secret … app.ini: read-only file system. Cold FRESH 3.6.0 passes; warm reattach-advance crashes. VERIFY this is genuinely a gitea-3.6.0/rootless-config + retained-volume interaction (e.g. pre-existing 3.5.3 app.ini / rootless-UID), NOT our warm-promote mounting app.ini read-only. If OUR machinery makes app.ini read-only (cold doesn't, warm does), it's a MACHINERY defect mislabeled as a recipe issue — that would NOT be an acceptable exception and would fail M1(A)/M2. Check: how does the warm advance mount/derive app.ini vs the cold install for gitea.
  • gitea correctly KEEPS 3.5.3 (never promote unhealthy) — good; confirm 3.5.3 record + volume intact.

M2.3 reframing — ACCEPTABLE ONLY IF rigorously demonstrated + flagged as a DoD deviation. Plan §3/§5 LITERALLY say run-twice → "SKIPS every recipe … clean no-op". That ideal assumed all-promote; reality = 15 promoted-at-latest + 5 that can't (4 genuine/documented reds + gitea recipe-bug). Builder's operative property = "no promoted-at-latest recipe re-runs; reds + gitea correctly retry." This is plan-consistent in SPIRIT (the no-op's purpose is no needless re-test of good-current recipes) and the plan forbids weakening tests to force promotes — so the literal ideal is unachievable honestly. I will ACCEPT it IFF: (i) an actual immediate 2nd sweep shows EXACTLY the 15 promoted-at-latest SKIP (no CI rerun) and ONLY the documented exceptions (gitea + 4 reds) RUN — I will re-run/inspect this myself, not trust a summary; (ii) every re-running recipe has a recorded DECISIONS reason; (iii) it is explicitly noted as a deviation from the literal "skip every recipe" so the operator sees it. If a promoted-at-latest recipe needlessly re-runs, or an undocumented recipe re-runs, M2.3 FAILs. NOT a veto now — this is the bar I'll hold at the claim.
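
Condition (i) of this bar amounts to a set comparison over the second sweep's outcomes. A minimal sketch, assuming the outcomes have already been extracted from the sweep log into a recipe → "SKIP"/"RUN" mapping (the function name, its inputs, and the message wording are hypothetical, not part of the runner):

```python
from typing import Dict, List, Set


def check_second_sweep(
    outcomes: Dict[str, str],
    promoted_at_latest: Set[str],
    documented_exceptions: Set[str],
) -> List[str]:
    """Return violations of the run-twice bar; an empty list means it is met."""
    skipped = {r for r, o in outcomes.items() if o == "SKIP"}
    ran = {r for r, o in outcomes.items() if o == "RUN"}
    violations = []
    # A promoted-at-latest recipe that did not skip is a needless re-run.
    for recipe in sorted(promoted_at_latest - skipped):
        violations.append(f"{recipe}: promoted-at-latest but did not SKIP")
    # A skip without a promoted-at-latest canonical is equally suspect.
    for recipe in sorted(skipped - promoted_at_latest):
        violations.append(f"{recipe}: SKIP without being promoted-at-latest")
    # Only documented exceptions are allowed to re-run.
    for recipe in sorted(ran - documented_exceptions):
        violations.append(f"{recipe}: RUN without a documented exception")
    return violations
```

Conditions (ii) and (iii) concern what DECISIONS records and remain a manual read; this sketch only covers the skip/run partition.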

Pre-claim pre-verification @ 2026-06-17T12:34Z — §2.G strip (M2.8) favorable; M2.5 bash-fix needs redeploy

  • §2.G UPGRADE_BASE_VERSION retirement (f611dda, 83c183d) — code-level strip CONFIRMED complete. grep -rn UPGRADE_BASE_VERSION (excl. machine-docs) → only EXPLANATORY comments/docs remain (testing.md, plausible/bluesky-pds/discourse meta comments, test_meta + test_upgrade_base comments, the resolver removal comment at run_recipe_ci.py:132) — NO live key/branch. plausible's pin gone (meta comment: dynamic base STEPS BACK to newest-published-strictly-older-than-3.1.0 = 3.0.1+v2.0.0 = the correct base, avoiding broken 3.0.0); meta KEYS 15→14 (test_meta.py); bluesky-pds comment now points to dynamic base. AT CLAIM: run the full unit suite (test_meta/test_upgrade_base green post-strip) + confirm plausible's UPGRADE tier actually resolves base 3.0.1+v2.0.0 dynamically AND passes (Builder claims "verified dynamic-base green" — re-run it myself). §2.G GATE (keep-if-broken) does NOT apply since plausible works.
  • M2.5 real timer fire — IN PROGRESS, caught a real bug. cebd293: the actual timer fire revealed the deployed nightly-sweep service was MISSING bash in nix runtimeInputs (a manual run wouldn't catch it — exactly why "real fire, not manual" is the DoD). Fix adds bash. NOTE: this is a nix module change → requires git -C /etc/cc-ci pull + nixos-rebuild switch to deploy, THEN a fresh real timer fire that ADVANCES ≥1 canonical (non-hollow). AT CLAIM: confirm the fix is deployed AND a post-fix real fire (systemctl start nightly-sweep.service or the timer) ran the non-hollow job to completion with evidence (a canonical ts moved / log shows the 20-recipe sweep), not exit-0 on empty.

Pre-claim @ 2026-06-17T13:09Z — DEFECT-3 fix (env parity) landed; assessment + verify-at-claim

Builder consumed DEFECT-3 and fixed it (2c61f2f): nightly-sweep.nix now prepends the host system PATH /run/current-system/sw/bin:/run/wrappers/bin so the timer sweep runs recipes in the SAME env as Drone's exec runner — one change for git-lfs/bash/openssl/etc. parity (vs enumerating runtimeInputs). Right fix in principle (the sweep SHOULD validate exactly as Drone CI does). nix module change → needs nixos-rebuild + a fresh real timer fire = the production-env M2.2/M2.5 evidence. DEFECT-3 stays OPEN until that re-fire. Verify at claim:

  • PARITY IS REAL not asserted: ssh cc-ci 'ls /run/current-system/sw/bin/git-lfs; systemctl cat drone-runner-exec* | grep -i PATH' — git-lfs present there AND Drone actually uses that PATH.
  • Re-fire flips gitea back to COLD-GREEN (custom/lfs passes) then hits the documented app.ini warm-advance exception (rc=0 GREEN-BUT-PROMOTE-FAILED) — restoring "cold green, advance-only" IN production, validating that exception framing. If gitea still reds at custom, parity isn't achieved.
  • Re-fire re-validates the promoted set under production env: the 15 promoted-at-latest SKIP, custom-html (now advanced to 1.13.0) SKIPs, 4 reds red, no NEW promote failures surface that the manual env hid.
  • Determinism unaffected: host system PATH is stable per nixos generation; matches Drone → correct comparison, not a non-determinism source. Favorable already-demonstrated (this fire): custom-html 1.11.0→1.13.0 advance PASS = constructed M2.6 older→new advance + a real non-hollow timer promotion. M2 still correctly UNCLAIMED.

Pre-claim observation @ 2026-06-17T14:30Z — DEFECT-3 parity REAL + live timer re-fire re-validating (NOT a verdict)

A POST-parity-fix real timer fire is in flight: nightly-sweep.service active since 13:01:01 UTC (Invocation b184fde4…, PID 2149231), single serial proc (no second sweep/run_recipe_ci on cc-ci). Captured from journalctl (production env, survives log rotation) + read-only config checks. This is the re-validation run that DEFECT-3 was held OPEN pending. Cold checks, my own, not the Builder's word:

  • PARITY IS REAL (my verify-at-claim criterion #1 — MET). nightly-sweep ExecStart wrapper line 17: export PATH="/run/current-system/sw/bin:/run/wrappers/bin:$PATH" — host system PATH prepended, byte-for-byte matching Drone's drone-runner-exec.service Environment="PATH=/run/current-system/sw/bin:/run/wrappers/bin". git-lfs present at /run/current-system/sw/bin/git-lfs → git-lfs-3.6.1. /etc/cc-ci HEAD = 2c61f2f (parity fix is the deployed runner code; merge-base --is-ancestor ✓). So parity is structural + deployed, not asserted.
  • gitea flips COLD-GREEN under production env (criterion #2 — MET behaviorally). In THIS timer fire: tests/gitea/custom/test_lfs_roundtrip.py::test_lfs_roundtrip PASSED (the exact test DEFECT-3 reded on the missing-git-lfs fire). gitea then RUN — new release 3.6.0 > canonical 3.5.3 and is processing the advance now — expected to land on the documented app.ini warm-advance exception (GREEN-BUT-PROMOTE-FAILED), i.e. "cold green, advance-only-fails," restoring the documented framing in production. DEFECT-3 git-lfs gap is behaviorally closed in the production timer env.
  • Promoted set re-validates under production env (criterion #3 — favorable so far): custom-html RUN — new release 1.13.0 > canonical 1.11.0 → PASS (promoted 1.13.0+1.31.1) (a REAL non-hollow timer promote/advance); and the promoted-at-latest recipes SKIP no-new-version (cryptpad, custom-html-tiny, drone[1.9.0], ghost, immich, lasuite-{docs,drive,meet}, mailu, matrix-synapse, n8n, plausible, uptime-kuma) — live determinism preview INSIDE the production fire. Reds so far: discourse rc=142 (timeout), mattermost-lts rc=1, mumble rc=1, bluesky-pds GREEN-BUT-PROMOTE-FAILED — all the documented exceptions, no NEW promote failures the manual env hid.
  • Determinism source check (criterion #4 — MET): host system PATH is fixed per nixos generation and equals Drone's → a stable, correct comparison env, not a non-determinism vector.
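
The parity criterion above reduces to checking that the sweep's PATH begins with exactly the entries Drone's exec runner uses, in the same order. A small sketch of that comparison; `has_path_parity` is a hypothetical helper written for illustration, and the two strings would have to be read from the deployed unit files by other means:

```python
from typing import List


def path_entries(value: str) -> List[str]:
    """Split a PATH-style string into its non-empty entries."""
    return [entry for entry in value.split(":") if entry]


def has_path_parity(sweep_path: str, drone_path: str) -> bool:
    """True when the sweep PATH starts with Drone's entries, in order."""
    drone = path_entries(drone_path)
    return path_entries(sweep_path)[: len(drone)] == drone
```

Order matters here because an earlier entry shadows a later one, so the same directories in a different order would not give the same tool resolution.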

This is strongly favorable toward closing DEFECT-3 and the production-env M2.2/M2.5 evidence, BUT M2 is still correctly UNCLAIMED and the fire is mid-gitea (not finished). I will NOT close DEFECT-3 or accept M2 until: (a) this fire completes start→finish single-serial with the final per-recipe summary; (b) I re-derive each promoted canonical's commit==tag-commit and a warm reattach; (c) the gitea app.ini exception, discourse/mattermost/mumble reds, and bluesky warm-routing exception are all recorded in DECISIONS (not silent no-canonicals); (d) the formal M2 claim arrives in STATUS with WHAT/HOW/EXPECTED. Staying read-only off the node while the sweep is in flight (single node).

Update @ 2026-06-17T14:39Z — production-env timer fire COMPLETED cleanly (still NOT a verdict). nightly-sweep.service finished 14:37:22 UTC, Result=success, ExecMainStatus=0, single serial (no leftover sweep/run_recipe_ci procs). Final per-recipe summary (journalctl, my own read):

  • custom-html: PASS (promoted 1.13.0+1.31.1) — a REAL non-hollow timer advance 1.11.0→1.13.0 in production env (M2.5 real-fire + M2.6 constructed older→new advance, both in one live timer fire).
  • 14 SKIP no-new-version (cryptpad, custom-html-tiny, drone, ghost, hedgedoc, immich, lasuite-{docs,drive,meet}, mailu, matrix-synapse, n8n, plausible, uptime-kuma) — live determinism: promoted-at-latest recipes correctly no-op in the production fire.
  • 5 documented exceptions: gitea GREEN-BUT-PROMOTE-FAILED (cold-green via lfs PASS; app.ini warm-advance exception, 3.5.3 kept); bluesky-pds GREEN-BUT-PROMOTE-FAILED (warm-routing); discourse/mattermost-lts/mumble red (canonical unchanged). No NEW promote failures the manual env masked. This resolves the "won't close DEFECT-3 until the fire completes" condition: the fire DID complete cleanly under real Drone-parity env. I am NOT yet closing DEFECT-3 or accepting M2 — that happens at the formal M2 claim, where I will cold re-derive each promoted canonical's commit==tag-commit + a warm reattach, confirm all 6 exceptions are recorded in DECISIONS, and re-run/inspect determinism myself. DEFECT-3 stays OPEN (narrowly: pending the claim-time confirmation), but its production re-validation is now favorable.

M2: PASS @ 2026-06-17T16:14Z — canonical sweep proven end-to-end (claim a4f1df4; DEFECT-3 CLOSED)

Verified from a COLD start: fresh independent clone on cc-ci (/tmp/adv-m2 @ deployed HEAD 2c61f2f), cold ssh cc-ci for live state/journald, and my OWN re-runs (unit suite, resolver calls, a live --quick warm reattach). I did NOT read JOURNAL-canon.md before this verdict. Every M2 sub-claim and every carried scrutiny point re-derived against the plan + observable behaviour, not the Builder's word.

M2.1 deploy + DEFECT-3 parity — PASS. Deployed /etc/cc-ci HEAD 2c61f2f (parity fix) is current — git diff --stat 2c61f2f origin/main -- runner/ tests/ nix/ scripts/ is EMPTY (the gap to Builder HEAD 009bc60 is docs/status only, no undeployed code). nightly-sweep ExecStart wrapper line 17 export PATH="/run/current-system/sw/bin:/run/wrappers/bin:$PATH" BYTE-MATCHES drone-runner-exec.service Environment="PATH=/run/current-system/sw/bin:/run/wrappers/bin"; git-lfs present at /run/current-system/sw/bin/git-lfs. Weekly timer OnCalendar=Sun *-*-* 03:00:00, Persistent. DEFECT-3 CLOSED: behaviorally proven in the production timer fire — tests/gitea/custom/test_lfs_roundtrip.py::test_lfs_roundtrip PASSED (the exact test that reded on the missing-git-lfs fire); gitea flips cold-green under the real Drone-parity env.

M2.2 + M2.5 real (non-hollow) timer fire — PASS. nightly-sweep.service fired by real systemd: active 13:01:01Z → completed 14:37:22Z, Result=success, ExecMainStatus=0, single serial (no 2nd sweep/run_recipe_ci proc — confirmed across my polls). Non-hollow: enrolled=20, ADVANCED custom-html 1.11.0→1.13.0 (the prior hollow timer logged enrolled canonicals=[]). All 16 canonicals re-derived: every canonical.json commit == the tested release tag's commit (git -C ~/.abra/recipes/<r> rev-list -n1 <version> == recorded commit) — cryptpad, custom-html(1.13.0+1.31.1/df2e273), custom-html-tiny, drone, ghost, gitea(3.5.3, known-good kept), hedgedoc, immich, lasuite-{docs,drive,meet}, mailu, matrix-synapse, n8n, plausible(3.1.0+v2.0.0/13458fac), uptime-kuma — all OK, no arbitrary-commit canonical. Timestamps 07:22→13:15Z; none fall in the 09:10–10:24Z concurrency window I flagged (drone correctly re-promoted 11:50, the tainted 10:06 one discarded). Reds left intact (discourse/mattermost-lts/mumble no canonical; bluesky no canonical; gitea kept 3.5.3) — never force-promoted.
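
The commit == tag-commit re-derivation can be scripted roughly as below. This is a sketch under assumptions: it reuses the git invocation quoted above and the canonical.json field names seen on disk (recipe, version, commit), while the function names and the injectable `resolve` parameter are hypothetical conveniences, not the runner's API.

```python
import subprocess
from pathlib import Path
from typing import Callable, Dict


def tag_commit(recipe: str, version: str) -> str:
    """Resolve a release tag to its commit in the local recipe checkout."""
    repo = Path.home() / ".abra" / "recipes" / recipe
    result = subprocess.run(
        ["git", "-C", str(repo), "rev-list", "-n1", version],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def canonical_matches_tag(
    record: Dict[str, str],
    resolve: Callable[[str, str], str] = tag_commit,
) -> bool:
    """True when the recorded commit is exactly the tagged release's commit."""
    return record["commit"] == resolve(record["recipe"], record["version"])
```

Passing the resolver in makes the comparison testable without a checkout; by default it shells out to git on the node where the recipes live.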

M2.3 determinism (run-twice) — PASS (operative no-op). The clean serial 2nd sweep launched 14:41:16Z (AFTER the 1st fire ended 14:37:22Z → NO overlap; single serial throughout my polls), enrolled=20. Final partition I read from journald myself: exactly 15 promoted-at-latest → SKIP no-new-version (incl. custom-html 1.13.0, just advanced → now skips = the central determinism proof) and 5 → RUN, every one a documented exception (gitea retries 3.6.0 advance; bluesky/discourse/mattermost-lts/mumble lack a known-good). My acceptance bar (set 12:21Z) is MET: (i) only the 15 promoted-at-latest skip and only documented exceptions run — verified, not trusted; (ii) every re-running recipe has a DECISIONS reason; (iii) DECISIONS explicitly flags this as a deviation from the literal "skip every recipe" ("'Skip every recipe' is the all-promoted ideal; the demonstrated property is 'no promoted-at-latest recipe re-runs'"). Plan-consistent (the plan forbids weakening a test to force a promote).

M2.4 tagged-promote gate — PASS. Untagged green ⇒ NO promote (proof-C + test_no_promote_when_untagged in the now-294-pass unit suite I re-ran); tagged green ⇒ promote (all 16 canonicals commit==tag, live in the production fire). Gate proven both ways.

M2.6 samever orthogonality — PASS. Path-2 (new tag → older→new promote): custom-html advanced 1.11.0→1.13.0 in the live production timer fire AND promoted healthy; gitea fired the trigger (RUN on 3.6.0>3.5.3). Path-1 (no new tag → SKIP): the 15 SKIP-no-new-version recipes. Step-back never fires in-sweep: read resolve_upgrade_base — it steps back ONLY when canonical==head version; the sweep RUNs only when latest tag > canonical, so the in-sweep base is strictly older → no same-version run is ever constructed. samever's same-version behaviour stays owned by the samever phase (PR path).

M2.7 disk budget — PASS. / 38G free (74% used); du -sh /var/lib/ci-warm = 1.1G; docker volumes 2.0GB. 16 retained canonicals fit with ample headroom at full 20-enrolled; no recipe dropped for disk (DECISIONS).

M2.8 UPGRADE_BASE_VERSION retired — PASS. Read resolve_upgrade_base source in full: the string UPGRADE_BASE_VERSION appears ONLY in the docstring (documenting its §2.G removal) — there is NO live override branch; resolution is purely dynamic (canonical-as-base + same-version step-back). grep -rn UPGRADE_BASE_VERSION runner/ tests/ docs/ = comments only; unit suite 294 pass. plausible: canonical 3.1.0+v2.0.0 == head → resolver steps back to newest_older_version = 3.0.1+v2.0.0 (re-derived live) — the exact known-good base the old pin forced, avoiding the broken clickhouse-404 3.0.0. §2.G GATE (keep-if-broken) correctly does NOT apply.
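
The dynamic resolution described here (canonical-as-base plus same-version step-back) might look roughly like the following. It is a sketch under assumptions, not the resolver's source: the parameter names, the `published` list, the `version_key` ordering, and the fallback when no canonical exists are all hypothetical.

```python
import re
from typing import List, Optional, Tuple


def version_key(tag: str) -> Tuple[int, ...]:
    """Order tags by the numeric part before '+' (assumed convention)."""
    return tuple(int(n) for n in re.findall(r"\d+", tag.split("+", 1)[0]))


def resolve_upgrade_base(
    head: str, canonical: Optional[str], published: List[str]
) -> Optional[str]:
    """Pick the version to upgrade FROM when testing `head`."""
    if canonical is not None and canonical != head:
        # Canonical-as-base: the recorded known-good differs from head.
        return canonical
    # Same-version step-back (also used here when no canonical exists,
    # which is an assumption): newest published tag strictly older than head.
    older = [v for v in published if version_key(v) < version_key(head)]
    return max(older, key=version_key) if older else None
```

With the canonical equal to head at 3.1.0+v2.0.0 and 3.0.1+v2.0.0 as the newest older published tag, this sketch returns 3.0.1+v2.0.0, the same base re-derived above. In a sweep, where the canonical is older than the latest tag, it returns the canonical and the step-back branch is never reached.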

Reusability (warm reattach) — PASS (my own cold run). MODE=quick reattach of custom-html: booted the warm stack from the RETAINED volume, test_content_roundtrip + test_custom_html_returns_200 PASSED (retained-volume content reused, 200 over the warm domain), quick PASS → known-good UNCHANGED. canonical version/commit identical before/after (1.13.0+1.31.1 / df2e273; only ts touched = benign status refresh, not a promote). This also independently confirms warm-domain HTTPS health WORKS for a non-bluesky recipe.
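
The "known-good UNCHANGED" check compares the canonical record before and after the reattach while ignoring the refreshed timestamp. A minimal sketch, assuming both records are already loaded as dictionaries; the helper name and the choice of identity fields are hypothetical:

```python
from typing import Dict

# Fields that identify a known-good; ts and status may legitimately refresh.
IDENTITY_FIELDS = ("recipe", "version", "commit")


def known_good_unchanged(before: Dict[str, str], after: Dict[str, str]) -> bool:
    """True when a reattach left the canonical's identity fields untouched."""
    return all(before.get(k) == after.get(k) for k in IDENTITY_FIELDS)
```

A changed version or commit after a quick-mode run would indicate a promote happened where none should have.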

Carried scrutiny — all CLEARED:

  • gitea app.ini exception is RECIPE-specific, not machinery: gitea-rootless mounts app.ini read-only by its own recipe (recipe_meta.py:68); our warm-promote/deploy_canonical code does not mount app.ini RO (grep). Cold-fresh 3.6.0 passes, warm reattach-advance crashes at config-load → recipe/retained-volume interaction. 3.5.3 known-good correctly kept.
  • bluesky warm-routing is recipe-specific: cold green + PDS 200 internal, warm domain /xrpc/_health→000; the other 15 promoted answer 200 over HTTPS (custom-html verified live by my reattach). Not machinery.
  • mattermost-lts (test_restore) + mumble (test_handshake) reds: tests UNMODIFIED this phase (git log: last touched phases 2/cfold), 0 xfail/skip markers — genuine reds, not weakened to dodge.
  • All 6 exceptions (keycloak, gitea, discourse, mattermost-lts, mumble, bluesky) recorded in DECISIONS with reasons — none silent.

Guardrail NO-AI-at-runtime — PASS. grep of nightly_sweep.py / warm_reconcile.py / recipe-mirror-sync.sh for anthropic|claude|openai|llm|gpt → zero calls (one code comment only). Pure script + systemd timer.
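
The guardrail grep can be approximated in a way that separates comment mentions from live code. This is an illustrative sketch only: `ai_hits` is a hypothetical helper, and cutting each line at the first `#` is a naive comment heuristic that would misjudge a `#` inside a string literal.

```python
import re
from typing import List, Tuple

AI_PATTERN = re.compile(r"anthropic|claude|openai|llm|gpt", re.IGNORECASE)


def ai_hits(source: str) -> List[Tuple[int, str]]:
    """Return (line number, line) for matches outside '#' comments."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        code = line.split("#", 1)[0]  # drop a trailing comment, naively
        if AI_PATTERN.search(code):
            hits.append((number, line.strip()))
    return hits
```

Because both Python and shell use `#` for comments, the same scan can be pointed at the contents of each of the three files named above.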

Verdict: M2 PASS. No VETO. All §5 Definition-of-Done items Adversary-cold-verified: tagged-release canonicals are real + reusable (untagged never promotes), mirror-sync faithful (M1), new-release-tag trigger skips no-new-version / runs new-tag (version-keyed), promote only on green-cold-latest-enrolled-tagged, demonstrated end-to-end in a real non-hollow production timer fire, run-twice determinism no-op (operative form, deviation flagged), samever orthogonal (step-back never fires in-sweep), all recipes enrolled + disk budget recorded, UPGRADE_BASE_VERSION retired (plausible dynamic base 3.0.1), AI-free runtime. M1 + M2 both fresh-PASS. The Builder may write ## DONE. (Consulted JOURNAL-canon.md only AFTER writing this verdict for context: no surprises.)