# REVIEW-2w — Adversary verdicts for Phase 2w (warm canonical + `--quick`)

Adversary-owned ledger. Append-only. Formal verdicts live here; gate claims live in STATUS-2w.md, findings in BACKLOG-2w.md `## Adversary findings`.

**Definition of Done verified here:** WC1–WC9 (see `plan-phase2w-warm-canonical-quick.md` §1). Each needs an independent COLD verdict before `## DONE` is permitted. The marquee proof is **WC9**: deliberately fail a PR under `--quick` and confirm the canonical's last-known-good is restored intact (data preserved) AND a `--quick` pass did not move the known-good.

## Verification map (what I will re-run cold per gate)

- **WC1** live-warm keycloak: dependent recipe's SSO custom tests pass against warm keycloak; concurrent dependents use distinct namespaced realms (no collision); leftover realms reaped.
- **WC2** data-warm canonical: canonical at a stable domain (≠ cold `-<6hex>`); declarative registry tracks recipe→commit; re-warmable from scratch.
- **WC3** snapshots: raw volume copy taken while UNDEPLOYED under stable path; one last-known-good per app, atomic replace; restore brings app back healthy with data.
- **WC4** `--quick`: reattach canonical → upgrade to PR head → generic UPGRADE+serving+custom; PASS→undeploy keep volume, known-good unchanged; FAIL→restore snapshot then undeploy; never promotes.
- **WC5** cold-only advancement: green full-cold on latest re-snapshots+re-tags; only cold advances.
- **WC6** nightly full-cold sweep: scheduled, declarative, MAX_TESTS-bounded.
- **WC7** trigger/authority/labeling: default `!testme`=cold; `--quick` opt-in, never gates merge; results carry mode; no-canonical fallback clean.
- **WC8** resource safety: warm runs serialize per app; warm keycloak shared via per-run realms; disk monitored+pruned; cold teardown still deletes per-run volumes; warm data excluded from D8 closure.
- **WC9** docs + cold verify incl. rollback proof; no softened tests.
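
A minimal sketch of the WC4 outcome rule from the map above, useful as a checklist when reading the real run logs. It only records step names and performs no deploys. `Canonical`, `run_quick` and the log strings are hypothetical names for illustration, not the Builder's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Canonical:
    """Hypothetical stand-in for one app's warm canonical state."""
    known_good: str                       # commit the last-known-good snapshot belongs to
    log: List[str] = field(default_factory=list)


def run_quick(canonical: Canonical, pr_head: str,
              tests_pass: Callable[[], bool]) -> bool:
    """Walk the WC4 steps in order and record them; return the test result."""
    canonical.log.append("reattach canonical")
    canonical.log.append(f"upgrade to {pr_head}")
    passed = tests_pass()                 # generic UPGRADE + serving + custom
    if passed:
        # PASS: tear down but keep the data volume.
        canonical.log.append("undeploy, keep volume")
    else:
        # FAIL: put the last-known-good data back first, then tear down.
        canonical.log.append(f"restore snapshot {canonical.known_good}")
        canonical.log.append("undeploy")
    # Never promotes: known_good is not assigned on either branch.
    return passed
```

The property to check cold is the last comment: after either branch, `known_good` is whatever it was before the run.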

---

## @2026-05-28 — Phase 2w start (Adversary online)

- Phase 2w interjected by operator (2026-05-28); Phase 2 paused. No 2w gates CLAIMED yet — Builder has not bootstrapped STATUS-2w.md. Phase-2 Docker Hub rate-limit fix was the last completed work.
- COLD access re-verified: `cc-ci-tailscaled` active; `ssh cc-ci` → NixOS 24.11 (50ab793); wildcard `*.ci.commoninternet.net` → gateway 143.244.213.108. Verification path is live.
- IDLE until the Builder claims a WC gate (watchdog will ping on claim). Standing veto power retained.

## @2026-05-28 — Design update absorbed (orchestrator: unpin + health-gated rollback)

SSOT updated (committed). Revised/added verification obligations I will hold the gate to:

- **WC1 (revised)** — keycloak is now **UNPINNED** like traefik: reconciler `abra recipe fetch` latest + chaos-deploy; `kcVersion` pin DROPPED; MUST keep the *secret-generate-only-if-missing* guard + the health-wait. Cold-check: no version pin in the nix module / reconciler; recipe fetched at activation (runtime) so the nix closure stays byte-identical (D8 preserved — verify closure hash unaffected by which keycloak version is live). Plus original WC1: dependent SSO custom tests pass against warm keycloak; concurrent dependents use distinct namespaced realms (no collision); stale realms reaped.
- **WC1.1 (NEW)** — health-gated deploy-with-rollback built INTO the warm/infra reconcilers (traefik + keycloak), NOT nix-generation rollback (the swarm app isn't in the generation). Pattern: record running version = last-good → deploy latest → health-check → healthy: commit last-good:=latest; unhealthy: roll back to last-good + `PushNotification` alert. Stateful (keycloak): undeploy → raw snapshot data volume → deploy latest → health-check → on fail restore snapshot + redeploy prior version (forward DB migrations make version-only rollback unsafe); reuse WC3 snapshot helper. traefik (stateless) = version rollback only.
  **ADVERSARY PROOF (mandatory, I must run it):** (a) force/simulate a BROKEN "latest" → confirm the warm app self-reverts to the prior healthy version, keycloak's **pre-upgrade data intact**, and an alert fired; (b) a HEALTHY update commits the new version as last-good. Watch for: silent failure (broken stays deployed), data loss on revert, no alert, or last-good not advancing on a healthy update.
- **WC6 (reordered)** — nightly = `nixos-rebuild switch` FIRST (warm/infra → latest, health-gated per WC1.1) THEN full-cold sweep; MUST NOT run while a test run is in flight; if the health-gate rolled an infra app back, alert fires and the sweep still runs against the healthy prior version.
- **WC8 carry** — confirm the leftover phase-2 cold app `lasu-0a6fb2` (orchestrator flagged it) is fully torn down (app+volumes+secrets gone), since cold-teardown-sacred + disk budget are WC8.
- Still no gate CLAIMED; W0 in flight. Continue idle until a WC gate is claimed (watchdog pings).

## @2026-05-29 — WC1.2 added (pre-deploy safety gate, runs BEFORE WC1.1)

- **WC1.2 (NEW)** — pre-deploy safety gate on warm/infra auto-update. Rationale: a passing health check does NOT prove a required manual migration ran, so gate BEFORE auto-deploy. Rule: only auto-apply **non-major (patch/minor)** upgrades with **no manual-migration release notes**. If current→latest is a **MAJOR recipe-version bump** OR the target `releaseNotes/.md` flags a manual migration → **DO NOT auto-upgrade**: stay on current + `PushNotification` alert **WITH the release notes** (operator upgrades manually). Independent of, and runs BEFORE, the WC1.1 health-gated rollback. Applies to nightly rebuild (WC6) AND any reconcile.
- Detection (verify the impl uses both): primary = major recipe-version bump (coop-cloud version `+`; a major **recipe-semver** bump = breaking, matches abra major-upgrade caution); secondary = scan target `releaseNotes/.md` for manual-migration markers.
- **ADVERSARY PROOF (mandatory):** simulate a major / manual-migration "latest" → confirm **hold-on-current** (no deploy attempted) + alert fired **carrying the release notes**; NO silent auto-upgrade. Watch for: a major bump slipping through as if patch; releaseNotes not scanned; alert without the notes; or the gate firing on a legitimate patch/minor (false hold).
- Ordering check: WC1.2 must short-circuit BEFORE WC1.1 even snapshots/deploys — i.e. on a held upgrade there is no snapshot/deploy/rollback churn, just hold + alert.
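
A minimal sketch of the WC1.2 gate and its ordering ahead of WC1.1, written as the behaviour I expect to observe rather than as the Builder's code. Assumptions, all mine: version strings have the shape `<recipe-semver>+<upstream>` with the recipe part before the `+`; `MANUAL_MARKERS` is a placeholder marker list; `deploy_with_rollback` and `alert` are hypothetical callbacks standing in for the WC1.1 path and the `PushNotification` alert.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Placeholder markers; the real list is whatever the implementation scans for.
MANUAL_MARKERS = ("manual migration", "manual step")


def recipe_major(version: str) -> int:
    """Major number of the recipe semver, i.e. the part before '+' (assumed format)."""
    recipe_part = version.lstrip("v").split("+", 1)[0]
    return int(recipe_part.split(".", 1)[0])


@dataclass
class GateDecision:
    auto_apply: bool
    reason: str
    notes: Optional[str] = None


def predeploy_gate(current: str, latest: str,
                   release_notes: Optional[str]) -> GateDecision:
    """Use both detections: major recipe bump first, then the release-notes scan."""
    if recipe_major(latest) > recipe_major(current):
        return GateDecision(False, "major recipe-version bump", release_notes)
    if release_notes and any(m in release_notes.lower() for m in MANUAL_MARKERS):
        return GateDecision(False, "release notes flag a manual migration",
                            release_notes)
    return GateDecision(True, "patch/minor with no manual-migration notes")


def reconcile(current: str, latest: str, release_notes: Optional[str],
              deploy_with_rollback: Callable[[str, str], str],
              alert: Callable[[str], None]) -> str:
    """Return the version left running. A held upgrade never reaches WC1.1."""
    decision = predeploy_gate(current, latest, release_notes)
    if not decision.auto_apply:
        # Hold on current and alert WITH the notes; no snapshot, deploy or rollback.
        alert(f"HOLD {current} -> {latest}: {decision.reason}\n{decision.notes or ''}")
        return current
    return deploy_with_rollback(current, latest)
```

For the cold proof, the equivalent observations are: on a held upgrade the WC1.1 path is never entered, the alert body contains the release notes, and a plain patch/minor bump still goes through.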