W0.7 (lasuite-docs race was transient) + W0.8 headline e2e: lasuite-docs custom pass (3 SSO tests incl. oidc_login + password_grant) vs WARM keycloak, deploy-count=1 (keycloak NOT co-deployed), per-run realm lasuite-docs-4c0858 created+deleted; warm kc left with only master realm. Concurrency+reaping proven (distinct realms for concurrent same-recipe runs; reap keeps-live/deletes-orphans). Gate claim in STATUS-2w carries full WHAT/HOW/EXPECTED/WHERE for cold verify.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
3.5 KiB
BACKLOG — Phase 2w (warm canonical + --quick)
Single-writer rule (plan §6.1): Builder edits ## Build backlog only; Adversary edits
## Adversary findings only.
Build backlog
W0 — Live-warm keycloak (WC1, WC1.1, WC1.2)
- W0.1 — sso.py realm lifecycle (list_realms/delete_keycloak_realm/realms_to_reap/reap_orphaned_realms) + 8 unit tests. DONE (74bf8c1).
- W0.2 — Orchestrator live-warm dep mode (warm.py + run_recipe_ci warm/cold split, per-run namespaced realm, realm-delete teardown, cold fallback, deploy-count). DONE (1b8d26b). Core mechanism proven deploy-free on the live warm keycloak.
- W0.3a — Declarative reconciler nix/modules/warm-keycloak.nix up + verified via rebuild. DONE (88c1114) but INTERIM (pinned + skip-if-healthy); superseded by W0.6 below.
- W0.5 — WC3 snapshot/restore helper (runner/harness/warmsnap.py) DONE (4cc1e15): live round-trip proven; later moved snapshot into <recipe>/snapshot/ subdir so last_good survives.
- W0.6 — Rewrite reconciler: unpin + WC1.2 safety gate + WC1.1 scaffold DONE (a044abb). runner/warm_reconcile.py python entrypoint in the nix store; unpinned (deploy latest tag); WC1.2 holds proven live; WC1.1 health-gate no-op path live. (traefik migration → later.)
- W0.7 — lasuite-docs redeploy race RESOLVED: it was transient resource contention from the killed stale Phase-2 run; converges fine on the clean system. No recipe/harness change needed.
- W0.8 — Headline WC1 e2e GREEN (b34mcluc4): lasuite-docs custom pass (3 SSO tests incl. oidc login + password grant) vs warm keycloak, deploy-count=1, per-run realm created+deleted; concurrency (distinct realms) + reaping proven.
- W0.9 — WC1.1 live proofs PASS (32f0071): marquee rollback (broken latest → self-revert + data intact + alert, last_good not advanced) + healthy upgrade commits last_good. WC1.2 holds (W0.6).
- WC8 fix (found en route): docker autoPrune --volumes removed (was failing daily + would delete warm volumes) (e73e439).
- W0.10 (follow-up, post-gate): wire the Builder-loop alert relay (/var/lib/ci-warm/alerts/*.json → PushNotification → alerts/seen/); apply the WC1.1/WC1.2 health-gated+safety-gate pattern to the traefik reconciler (proxy.nix, stateless = version rollback only). → folds into WC1.1/WC8 final verification.
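
The WC1.1 behaviour proven in W0.9 (broken latest reverts itself and raises an alert without advancing last_good; a healthy upgrade commits last_good) can be illustrated roughly as follows. This is a minimal sketch, not the code in runner/warm_reconcile.py: `ReconcileState`, `reconcile`, and the `deploy`/`is_healthy` callables are hypothetical names chosen for illustration, and alerts are kept in memory rather than written to disk.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ReconcileState:
    # Hypothetical state holder: the last tag that passed the health check.
    last_good: Optional[str]
    alerts: List[str] = field(default_factory=list)


def reconcile(
    state: ReconcileState,
    latest: str,
    deploy: Callable[[str], None],
    is_healthy: Callable[[], bool],
) -> str:
    """Deploy `latest`; keep it only if healthy, else revert to last_good.

    Returns the tag left running.
    """
    if latest == state.last_good and is_healthy():
        return latest                  # no-op path: already on a healthy latest
    deploy(latest)
    if is_healthy():
        state.last_good = latest       # healthy upgrade commits last_good
        return latest
    state.alerts.append(f"upgrade to {latest} failed health check")
    if state.last_good is None:
        raise RuntimeError("unhealthy and no last_good to revert to")
    deploy(state.last_good)            # self-revert; last_good is NOT advanced
    return state.last_good
```

Passing the deploy and health-check steps in as callables keeps the decision logic testable without a live service, which is also how the same pattern could be reused for the stateless traefik case.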
→ Gate WC1 + WC1.1 + WC1.2 CLAIMED in STATUS-2w (awaiting Adversary).
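
For readers checking the WC1 claim, the per-run realm and reaping rules (distinct realms for concurrent runs; reap keeps live realms and deletes orphans; master is left alone) might look like the sketch below. It assumes a naming scheme of `<recipe>-<6 hex chars>`, inferred from the example realm lasuite-docs-4c0858. The function `realms_to_reap` shares its name with the sso.py helper, but the signature and body here are illustrative assumptions, and `new_realm_name` is a hypothetical helper.

```python
import re
import secrets
from typing import Iterable, List, Set

# Assumed naming scheme: "<recipe>-<6 hex chars>", e.g. lasuite-docs-4c0858.
PER_RUN_REALM = re.compile(r"^(?P<recipe>.+)-[0-9a-f]{6}$")


def new_realm_name(recipe: str) -> str:
    """Return a fresh per-run realm name; concurrent runs are unlikely to collide."""
    return f"{recipe}-{secrets.token_hex(3)}"


def realms_to_reap(existing: Iterable[str], live: Set[str]) -> List[str]:
    """Pick per-run realms that no live run owns.

    Realms that do not match the per-run pattern (such as master) are never
    selected, and realms owned by a live run are kept.
    """
    return sorted(
        name
        for name in existing
        if PER_RUN_REALM.match(name) and name not in live
    )
```

Keeping the selection a pure function over realm names means the keep-live/delete-orphans rule can be unit tested without touching keycloak; the actual deletion would be a separate step.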
W1 — Canonical registry (WC2)
- W1.1 — Canonical registry/reconciler (declarative; tracks recipe→known-good commit; stable domain warm-<recipe>). (Snapshot/restore done in W0.5; WC3 closes with W1's canonicals.)
W2 — --quick mode (WC4, WC7)
- W2.1 — run_recipe_ci.py --quick path (reattach → upgrade-to-PR-head → assert → PASS undeploy / FAIL restore+undeploy; never promote).
- W2.2 — Trigger surface + labeling + no-canonical fallback (WC7).
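
The W2.1 control flow (reattach, upgrade to PR head, assert, then undeploy on PASS or restore plus undeploy on FAIL, with no promotion in either case) could be sketched as below. Since W2.1 is not yet built, every name here is hypothetical; the sketch also assumes that an exception during upgrade or assertion counts as a FAIL.

```python
from typing import Callable


def quick_run(
    reattach: Callable[[], None],
    upgrade_to_pr_head: Callable[[], None],
    run_assertions: Callable[[], bool],
    restore_snapshot: Callable[[], None],
    undeploy: Callable[[], None],
) -> bool:
    """Run the quick path once and return True on PASS.

    The canonical is never promoted here, whatever the outcome.
    """
    reattach()
    passed = False
    try:
        upgrade_to_pr_head()
        passed = run_assertions()
    finally:
        if not passed:
            restore_snapshot()   # FAIL: put the canonical data back first
        undeploy()               # both outcomes end with an undeploy
    return passed
```

Placing restore and undeploy in a `finally` block is one way to make sure a crashed run still leaves the canonical in its pre-run state.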
W3 — Cold-advances-canonical + nightly sweep (WC5, WC6)
- W3.1 — Promote-on-green-cold (snapshot+tag canonical at teardown on green cold; seed on first green).
- W3.2 — Nightly full-cold sweep (declarative scheduler, MAX_TESTS-bounded).
W4 — Hardening + docs + cold verify (WC8, WC9)
- W4.1 — Resource/isolation hardening: disk monitor+prune, per-app serialize, warm excluded from D8.
- W4.2 — Docs (warm/quick) + the WC9 rollback proof.
Adversary findings
(none yet)