cc-ci/machine-docs/BACKLOG-2w.md
autonomic-bot 985686f60e claim(2w): Gate WC1+WC1.1+WC1.2 CLAIMED — warm keycloak headline e2e GREEN + concurrency/reaping + rollback/holds proven
W0.7 (lasuite-docs race was transient) + W0.8 headline e2e: lasuite-docs custom
pass (3 SSO tests incl. oidc_login + password_grant) vs WARM keycloak,
deploy-count=1 (keycloak NOT co-deployed), per-run realm lasuite-docs-4c0858
created+deleted; warm kc left with only master realm. Concurrency+reaping proven
(distinct realms for concurrent same-recipe runs; reap keeps-live/deletes-orphans).
Gate claim in STATUS-2w carries full WHAT/HOW/EXPECTED/WHERE for cold verify.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 01:40:32 +01:00


BACKLOG — Phase 2w (warm canonical + --quick)

Single-writer rule (plan §6.1): Builder edits ## Build backlog only; Adversary edits ## Adversary findings only.

Build backlog

W0 — Live-warm keycloak (WC1, WC1.1, WC1.2)

  • W0.1 — sso.py realm lifecycle (list_realms/delete_keycloak_realm/realms_to_reap/reap_orphaned_realms) + 8 unit tests. DONE (74bf8c1).
  • W0.2 — Orchestrator live-warm dep mode (warm.py + run_recipe_ci warm/cold split, per-run namespaced realm, realm-delete teardown, cold fallback, deploy-count). DONE (1b8d26b). Core mechanism proven deploy-free on the live warm keycloak.
  • W0.3a — Declarative reconciler nix/modules/warm-keycloak.nix up + verified via rebuild. DONE (88c1114) but INTERIM (pinned + skip-if-healthy) — superseded by W0.6 below.
  • W0.5 — WC3 snapshot/restore helper (runner/harness/warmsnap.py). DONE (4cc1e15). Live round-trip proven; the snapshot was later moved into the <recipe>/snapshot/ subdir so that last_good survives.
  • W0.6 — Rewrite reconciler: unpin + WC1.2 safety gate + WC1.1 scaffold. DONE (a044abb). runner/warm_reconcile.py python entrypoint in the nix store; unpinned (deploys the latest tag); WC1.2 holds proven live; WC1.1 health-gate no-op path live. (traefik migration → later.)
  • W0.7 — lasuite-docs redeploy race RESOLVED — it was transient resource contention from the killed stale Phase-2 run; converges fine on the clean system. No recipe/harness change needed.
  • W0.8 — Headline WC1 e2e GREEN (b34mcluc4): lasuite-docs custom pass (3 SSO tests incl. oidc login + password grant) vs warm keycloak, deploy-count=1, per-run realm created+deleted; concurrency (distinct realms) + reaping proven.
  • W0.9 — WC1.1 live proofs PASS (32f0071): marquee rollback (broken latest → self-revert + data intact + alert, last_good not advanced) + healthy upgrade commits last_good. WC1.2 holds (W0.6).
  • WC8 fix (found en route): docker autoPrune --volumes removed (was failing daily + would delete warm volumes) (e73e439).
  • W0.10 (follow-up, post-gate): wire the Builder-loop alert relay (/var/lib/ci-warm/alerts/*.json → PushNotification → alerts/seen/); apply the WC1.1/WC1.2 health-gated+safety-gate pattern to the traefik reconciler (proxy.nix, stateless = version rollback only). → folds into WC1.1/WC8 final verification.
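
Illustrative sketch for W0.1/W0.2/W0.8 (not the code in sso.py or warm.py): one way the per-run realm naming and the "keep live, delete orphans" selection could look. The function signatures, the PROTECTED_REALMS constant, and the 6-hex-char random suffix are assumptions; only the names realms_to_reap and the example realm lasuite-docs-4c0858 come from the items above.

```python
import secrets
from typing import Iterable, List, Optional

# Assumption: "master" is the only realm the warm keycloak keeps permanently.
PROTECTED_REALMS = frozenset({"master"})


def per_run_realm(recipe: str, suffix: Optional[str] = None) -> str:
    """Return a namespaced realm name such as 'lasuite-docs-4c0858'."""
    # A random 6-hex-char suffix makes collisions between concurrent
    # same-recipe runs unlikely (hypothetical scheme).
    return f"{recipe}-{suffix or secrets.token_hex(3)}"


def realms_to_reap(existing: Iterable[str], live: Iterable[str]) -> List[str]:
    """Pick orphans: realms that are neither protected nor owned by a live run."""
    keep = PROTECTED_REALMS | set(live)
    return sorted(r for r in existing if r not in keep)
```

Keeping the selection a pure function means it can be unit-tested without a keycloak; the actual deletion would go through delete_keycloak_realm, whose real signature is not shown here.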

Gate WC1 + WC1.1 + WC1.2 CLAIMED in STATUS-2w (awaiting Adversary).
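
Illustrative sketch of the WC1.1 behaviour proven in W0.9 (not the code in runner/warm_reconcile.py): a healthy upgrade commits last_good, while a broken one alerts, reverts, and leaves last_good where it was. The function name health_gated_upgrade and the deploy, is_healthy, and alert callables are hypothetical stand-ins.

```python
from typing import Callable, Optional


def health_gated_upgrade(
    candidate: str,
    last_good: Optional[str],
    deploy: Callable[[str], None],
    is_healthy: Callable[[], bool],
    alert: Callable[[str], None],
) -> Optional[str]:
    """Deploy candidate and return the value last_good should hold afterwards."""
    deploy(candidate)
    if is_healthy():
        return candidate  # healthy upgrade: commit last_good
    alert(f"upgrade to {candidate} unhealthy; reverting to {last_good}")
    if last_good is not None:
        deploy(last_good)  # self-revert to the previous known-good version
    return last_good  # last_good is not advanced
```

Returning the new last_good rather than writing it inside the function leaves persistence to the caller, so a crash mid-upgrade cannot advance it by accident.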

W1 — Canonical registry (WC2)

  • W1.1 — Canonical registry/reconciler (declarative; tracks recipe→known-good commit; stable domain warm-<recipe>). (Snapshot/restore done in W0.5; WC3 closes with W1's canonicals.)

W2 — --quick mode (WC4, WC7)

  • W2.1 — run_recipe_ci.py --quick path (reattach → upgrade-to-PR-head → assert → PASS undeploy / FAIL restore+undeploy; never promote).
  • W2.2 — Trigger surface + labeling + no-canonical fallback (WC7).
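
Illustrative sketch of the W2.1 control flow (not the planned run_recipe_ci.py code): reattach, upgrade to PR head, assert, then undeploy on PASS or restore and undeploy on FAIL. All five callables are hypothetical, and treating any raised exception as FAIL is an assumption. There is deliberately no promote step.

```python
from typing import Callable


def run_quick(
    reattach: Callable[[], None],
    upgrade_to_pr_head: Callable[[], None],
    run_assertions: Callable[[], bool],
    restore: Callable[[], None],
    undeploy: Callable[[], None],
) -> bool:
    """Run the quick path once; return True on PASS, False on FAIL."""
    passed = False
    try:
        reattach()
        upgrade_to_pr_head()
        passed = bool(run_assertions())
    except Exception:
        passed = False  # assumption: any failing step counts as FAIL
    try:
        if not passed:
            restore()  # FAIL: put the canonical state back first
    finally:
        undeploy()  # PASS and FAIL both end with undeploy
    return passed
```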

W3 — Cold-advances-canonical + nightly sweep (WC5, WC6)

  • W3.1 — Promote-on-green-cold (snapshot+tag canonical at teardown on green cold; seed on first green).
  • W3.2 — Nightly full-cold sweep (declarative scheduler, MAX_TESTS-bounded).
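
Illustrative sketch of W3.1 (nothing here is implemented yet): on a green cold run the canonical is snapshotted and tagged at teardown, and the first green run seeds the registry entry. The dict-based registry, the snapshot_and_tag callable, and the returned outcome strings are all hypothetical.

```python
from typing import Callable, Dict


def promote_on_green_cold(
    registry: Dict[str, str],
    recipe: str,
    commit: str,
    green: bool,
    snapshot_and_tag: Callable[[str, str], None],
) -> str:
    """Advance recipe -> known-good commit only after a green cold run."""
    if not green:
        return "skipped"  # a red run never touches the canonical
    outcome = "advanced" if recipe in registry else "seeded"
    snapshot_and_tag(recipe, commit)  # snapshot before recording the commit
    registry[recipe] = commit
    return outcome
```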

W4 — Hardening + docs + cold verify (WC8, WC9)

  • W4.1 — Resource/isolation hardening: disk monitor+prune, per-app serialize, warm excluded from D8.
  • W4.2 — Docs (warm/quick) + the WC9 rollback proof.

Adversary findings

(none yet)