claim(2w): Gate WC1+WC1.1+WC1.2 CLAIMED — warm keycloak headline e2e GREEN + concurrency/reaping + rollback/holds proven

W0.7 (lasuite-docs race was transient) + W0.8 headline e2e: lasuite-docs custom
pass (3 SSO tests incl. oidc_login + password_grant) vs WARM keycloak,
deploy-count=1 (keycloak NOT co-deployed), per-run realm lasuite-docs-4c0858
created+deleted; warm kc left with only master realm. Concurrency+reaping proven
(distinct realms for concurrent same-recipe runs; reap keeps-live/deletes-orphans).
Gate claim in STATUS-2w carries full WHAT/HOW/EXPECTED/WHERE for cold verify.
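
The reaping rule above (keep realms owned by live runs, delete orphans, leave `master` alone) can be sketched as a pure selection function. This is only an illustration under assumptions: the `<recipe>-<6 hex chars>` naming is inferred from the example realm `lasuite-docs-4c0858`, and `realms_to_reap`, `existing`, and `live` are hypothetical names, not the harness's actual API.

```python
import re

# Assumed naming scheme, inferred from the example realm "lasuite-docs-4c0858":
# "<recipe>-<6 lowercase hex chars>". The real harness may use a different scheme.
PER_RUN_REALM = re.compile(r"^[a-z0-9][a-z0-9-]*-[0-9a-f]{6}$")


def realms_to_reap(existing, live):
    """Return the per-run realms that no live run owns (orphans), sorted.

    `existing` is every realm name on the warm keycloak; `live` is the
    collection of realm names owned by runs still in progress. Names that do
    not match the per-run pattern (for example "master") are never selected.
    """
    live = set(live)
    return sorted(
        name
        for name in existing
        if PER_RUN_REALM.match(name) and name not in live
    )
```

Keeping the selection separate from the deletion call makes the keeps-live/deletes-orphans behaviour checkable without a running keycloak.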

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 01:40:32 +01:00
parent cbc193e535
commit 985686f60e
3 changed files with 112 additions and 21 deletions

@@ -13,22 +13,26 @@ Single-writer rule (plan §6.1): Builder edits `## Build backlog` only; Adversar
Core mechanism proven deploy-free on the live warm keycloak.
- [x] W0.3a — Declarative reconciler `nix/modules/warm-keycloak.nix` up + verified via rebuild.
DONE (88c1114) but INTERIM (pinned + skip-if-healthy) — superseded by W0.6 below.
- [ ] **W0.5 — WC3 snapshot/restore helper FIRST** (prereq for WC1.1): `runner/harness/warmsnap.py`
— raw copy of an app's data volume(s) while undeployed, under `/var/lib/ci-warm/<recipe>/`,
atomic replace, one last-good, restore round-trips data. + unit tests + live round-trip proof.
- [ ] **W0.6 — Rewrite reconciler: unpin + WC1.2 safety gate + WC1.1 health-gated rollback.**
UNPIN keycloak (fetch latest + chaos; drop kcVersion); keep secret-guard + health-wait. WC1.2
gate: hold-on-current + alert on major/manual-migration bump (no deploy churn). WC1.1: record
last-good → keycloak undeploy→snapshot→deploy latest → health-gate → commit-or-(restore+
redeploy-prior+alert). Apply the same health-gated+safety-gate pattern to traefik (version
rollback only, stateless). Settle the alert mechanism (see DECISIONS).
- [ ] **W0.7 — Fix lasuite-docs in-place-redeploy race** (nginx web `host not found in upstream
backend` during chaos redeploy) OR pick a more-robust SSO dependent for the headline proof.
- [ ] W0.8 — Headline WC1 e2e: dependent SSO custom test green vs warm keycloak; concurrent
dependents distinct realms (no collision); leftover realms reaped. → claim WC1.
- [ ] W0.9 — WC1.1/WC1.2 Adversary proofs: simulate broken latest → self-revert + data intact +
alert; healthy update commits last-good; major/manual-migration → hold + alert-with-notes.
→ claim WC1.1/WC1.2.
- [x] **W0.5 — WC3 snapshot/restore helper** (`runner/harness/warmsnap.py`) DONE (4cc1e15) — live
round-trip proven; later moved snapshot into `<recipe>/snapshot/` subdir so last_good survives.
- [x] **W0.6 — Rewrite reconciler: unpin + WC1.2 safety gate + WC1.1 scaffold** DONE (a044abb).
`runner/warm_reconcile.py` python entrypoint in the nix store; unpinned (deploy latest tag);
WC1.2 holds proven live; WC1.1 health-gate no-op path live. (traefik migration → later.)
- [x] **W0.7 — lasuite-docs redeploy race** RESOLVED — it was transient resource contention from the
killed stale Phase-2 run; converges fine on the clean system. No recipe/harness change needed.
- [x] W0.8 — Headline WC1 e2e GREEN (b34mcluc4): lasuite-docs custom pass (3 SSO tests incl. oidc
login + password grant) vs warm keycloak, deploy-count=1, per-run realm created+deleted;
concurrency (distinct realms) + reaping proven.
- [x] W0.9 — WC1.1 live proofs PASS (32f0071): marquee rollback (broken latest → self-revert + data
intact + alert, last_good not advanced) + healthy upgrade commits last_good. WC1.2 holds (W0.6).
- [x] **WC8 fix (found en route):** docker autoPrune `--volumes` removed (was failing daily + would
delete warm volumes) (e73e439).
- [ ] **W0.10 (follow-up, post-gate):** wire the Builder-loop alert relay
(`/var/lib/ci-warm/alerts/*.json` → PushNotification → `alerts/seen/`); apply the WC1.1/WC1.2
health-gated+safety-gate pattern to the traefik reconciler (proxy.nix, stateless = version
rollback only). → folds into WC1.1/WC8 final verification.
**Gate WC1 + WC1.1 + WC1.2 CLAIMED** in STATUS-2w (awaiting Adversary).
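
The WC1.1 flow named in the W0.6 and W0.9 items (undeploy, snapshot, deploy latest, health-gate, then commit or restore and redeploy the prior version with an alert) can be sketched as below. This is a minimal sketch, not the code in `runner/warm_reconcile.py`: `health_gated_upgrade`, the `state` dict, and every callable parameter are hypothetical stand-ins, and the extra undeploy before the restore is an assumption based on snapshots being taken while the app is undeployed.

```python
def health_gated_upgrade(state, latest, *, undeploy, snapshot, deploy,
                         healthy, restore, alert):
    """Try `latest`; advance last_good only if the health gate passes.

    `state` is a dict holding the current "last_good" version. The callables
    are injected so the control flow can be exercised without a live host.
    Returns True when `latest` was committed, False when it was rolled back.
    """
    prior = state["last_good"]

    undeploy()
    snapshot()      # raw copy of the data volume(s) while undeployed
    deploy(latest)

    if healthy():
        state["last_good"] = latest  # commit: latest becomes the new last-good
        return True

    # Health gate failed: last_good is NOT advanced.
    undeploy()      # assumption: restore also runs against an undeployed app
    restore()       # put the snapshotted data back
    deploy(prior)   # redeploy the previous known-good version
    alert(f"rollback: {latest} failed health gate, reverted to {prior}")
    return False
```

With recording fakes in place of the callables, the rollback path can be checked for both the call order and the unchanged `last_good`, which mirrors the "last_good not advanced" proof in W0.9.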
### W1 — Canonical registry (WC2)
- [ ] W1.1 — Canonical registry/reconciler (declarative; tracks recipe→known-good commit; stable