- Seed STATUS-2w / BACKLOG-2w / JOURNAL-2w (WC1-WC9 DoD, W0-W4 milestones).
- Tore down leftover Phase-2 cold apps (lasu-0a6fb2/keyc-07d81e/lasu-dbg); disk 91%->86%.
- DECISIONS: warm-domain scheme, per-run realm isolation, warm keycloak as declarative infra, cold fallback.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
STATUS — Phase 2w (warm canonical deployments + --quick CI mode)
Phase plan (SSOT): /srv/cc-ci/cc-ci-plan/plan-phase2w-warm-canonical-quick.md
Loop state for THIS phase: STATUS-2w / BACKLOG-2w / REVIEW-2w / JOURNAL-2w (DECISIONS.md shared).
Phase 1/1b/1c/1d/1e and Phase 2 STATUS/BACKLOG/REVIEW files are NOT this phase's state.
Phase 2 is PAUSED (STATUS-2/BACKLOG-2 intact) and resumes after 2w ## DONE.
Phase
Add a warm-data layer to cc-ci CI: a live-warm shared keycloak for SSO deps, data-warm per-recipe
canonicals at stable domains, known-good snapshots, an opt-in --quick fast lane that reattaches the
canonical and upgrades to PR head (rolling back on failure), cold-only canonical advancement, and a
nightly full-cold sweep. Definition of Done = WC1–WC9 (plan §1), each Adversary cold-verified.
Definition of Done (Phase 2w) — WC1–WC9, each Adversary cold-verified in REVIEW-2w
- WC1 — Live-warm keycloak (SSO dep) at a stable domain; dependents create+delete per-run namespaced realms; concurrent dependents don't collide; leftover realms reaped.
- WC2 — Data-warm canonical model: per-recipe canonical at a stable domain, declarative registry tracking recipe→known-good commit; re-warmable from scratch.
- WC3 — Known-good snapshots: raw volume copy taken while undeployed under stable path; one last-known-good per app, atomic replace; restore proven to round-trip data.
- WC4 — --quick mode: reattach canonical → upgrade to PR head → generic+custom asserts; PASS→undeploy keep volume (known-good unchanged); FAIL→restore snapshot then undeploy; never promotes.
- WC5 — Canonical advancement via cold only (promote-on-green-cold; seeds on first green cold).
- WC6 — Nightly full-cold sweep (scheduled, declarative, MAX_TESTS-bounded).
- WC7 — Trigger/authority/labeling: default !testme = cold; --quick opt-in, never gates merge; results carry mode; clean no-canonical fallback.
- WC8 — Resource safety + isolation: warm runs serialize per app; warm keycloak shared via per-run realms; disk monitored+pruned; cold teardown sacred; warm data excluded from D8 closure.
- WC9 — Docs + cold verify incl. the rollback proof (deliberately fail a PR under --quick, confirm last-known-good restored intact; a --quick pass did not move the known-good).
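The WC4 control flow above can be sketched as a small decision routine. This is only an illustration under stated assumptions: `QuickRun`, `run_quick`, and the step strings are hypothetical names, each step is recorded rather than executed, and treating an exception during upgrade as FAIL is an assumption the plan may or may not share.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class QuickRun:
    """Records which steps a --quick run took (hypothetical structure)."""
    steps: List[str] = field(default_factory=list)
    passed: bool = False
    promoted: bool = False  # WC4: a --quick run never promotes the known-good


def run_quick(upgrade_and_assert: Callable[[], bool]) -> QuickRun:
    """Sketch of the WC4 flow; steps are only recorded, not executed."""
    run = QuickRun()
    run.steps.append("reattach canonical")
    try:
        # Stands in for: upgrade to PR head, then generic + custom asserts.
        run.passed = upgrade_and_assert()
    except Exception:
        run.passed = False  # assumption: a crash is treated like a FAIL
    if not run.passed:
        run.steps.append("restore snapshot")  # roll back to last-known-good
    run.steps.append("undeploy")  # PASS keeps the volume; FAIL was restored first
    return run
```

On PASS the recorded steps are reattach then undeploy; on FAIL the snapshot restore is recorded before the undeploy. In both cases `promoted` stays false, which mirrors the "never promotes" rule.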
Milestones (plan §3)
- W0 — Warm keycloak (WC1). ← IN FLIGHT
- W1 — Canonical registry + snapshot/restore (WC2, WC3).
- W2 — --quick mode (WC4, WC7).
- W3 — Cold-advances-canonical + nightly sweep (WC5, WC6).
- W4 — Resource/isolation hardening + docs + cold verify incl. rollback proof (WC8, WC9). → DONE.
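For the W1 snapshot/restore work (WC3: one last-known-good per app, atomic replace, restore round-trips data), one possible shape is a copy into a fresh directory followed by an atomic symlink swap. This is a sketch, not the plan's actual layout: the `current` link name, the `snap-` prefix, the `data` subdirectory, and both function names are assumptions, and the symlink swap is a technique chosen here for illustration.

```python
import os
import shutil
import tempfile
from pathlib import Path


def snapshot_volume(volume: Path, snap_root: Path, app: str) -> Path:
    """Copy `volume` into a fresh directory, then atomically repoint <app>/current."""
    app_dir = snap_root / app
    app_dir.mkdir(parents=True, exist_ok=True)
    new_dir = Path(tempfile.mkdtemp(prefix="snap-", dir=str(app_dir)))
    shutil.copytree(volume, new_dir / "data")

    link = app_dir / "current"
    old_target = link.resolve() if link.is_symlink() else None
    tmp_link = app_dir / "current.tmp"
    if tmp_link.is_symlink():
        tmp_link.unlink()  # leftover from an interrupted earlier run
    os.symlink(new_dir.name, tmp_link)
    os.replace(tmp_link, link)  # rename: readers see the old or new target

    if old_target is not None and old_target != new_dir.resolve():
        shutil.rmtree(old_target)  # keep exactly one last-known-good per app
    return new_dir


def restore_volume(volume: Path, snap_root: Path, app: str) -> None:
    """Replace the volume contents with the last-known-good copy."""
    source = (snap_root / app / "current").resolve() / "data"
    if volume.exists():
        shutil.rmtree(volume)
    shutil.copytree(source, volume)
```

The copy is assumed to run while the app is undeployed, as WC3 requires, so the volume is not changing underneath `copytree`.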
In flight
W0 — live-warm keycloak (WC1). Building incrementally:
- sso.py realm lifecycle: add delete_keycloak_realm + list_realms + reap_stale_realms (realm is the per-run isolation unit on a shared keycloak).
- Orchestrator dep path: live-warm mode for the keycloak dep — use the stable warm domain + a per-run namespaced realm (not realm=parent_recipe), delete the realm on teardown instead of undeploying keycloak. Fall back to cold co-deploy if no warm keycloak present.
- Declarative Nix reconciler (nix/modules/warm-keycloak.nix) — systemd oneshot converges the warm keycloak to deployed+healthy at the stable domain.
- e2e proof + concurrency (distinct realms) + reaping → claim WC1.
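The realm reaping step listed above could look roughly like the following. The function name `reap_stale_realms` comes from the list, but its signature here is an assumption, as are the `ci-` prefix, the `<recipe>-<run_id>-<created>` naming, and the age-based staleness rule; the real sso.py may track realm ownership differently. The Keycloak calls are injected as plain callables so the sketch runs without a server.

```python
import time
from typing import Callable, Iterable, List, Optional

REALM_PREFIX = "ci-"  # hypothetical namespace marker for per-run realms


def realm_name(recipe: str, run_id: str, created: int) -> str:
    """Build a per-run realm name that embeds its creation time (assumed scheme)."""
    return f"{REALM_PREFIX}{recipe}-{run_id}-{created}"


def reap_stale_realms(
    list_realms: Callable[[], Iterable[str]],
    delete_realm: Callable[[str], None],
    max_age_s: float,
    now: Optional[float] = None,
) -> List[str]:
    """Delete per-run realms older than max_age_s; return the names reaped."""
    now = time.time() if now is None else now
    reaped = []
    for name in list_realms():
        if not name.startswith(REALM_PREFIX):
            continue  # never touch master or other non-CI realms
        try:
            created = int(name.rsplit("-", 1)[1])
        except (IndexError, ValueError):
            continue  # name does not follow the assumed scheme; leave it alone
        if now - created > max_age_s:
            delete_realm(name)
            reaped.append(name)
    return reaped
```

Because every run gets its own realm name, two concurrent dependents never share a realm, and the reaper only ever touches names it can parse as its own.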
Gate
(none claimed yet)
Blocked
(none)
Notes
- Disk budget (WC8 watch): cc-ci / was 91% (2.4G free) at phase start; freed orphaned Phase-2 cold apps (lasu-0a6fb2 12-svc, keyc-07d81e, lasu-dbg) → 86% (3.8G free). 9.7GB reclaimable in Docker images kept as warm pull-cache (authenticated pulls now, so re-pull is cheaper but slower).
- Stable-domain scheme (proposed, see DECISIONS): warm-<recipe>.ci.commoninternet.net, distinct from cold <recipe[:4]>-<6hex>.
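The proposed naming scheme can be illustrated with two small helpers. Both function names are hypothetical, and since the note gives only the cold label pattern and not the cold domain suffix, the cold helper returns just the label.

```python
import secrets

WARM_BASE = "ci.commoninternet.net"  # base domain taken from the proposed scheme


def warm_domain(recipe: str) -> str:
    """Stable domain for a recipe's warm canonical: warm-<recipe>.<base>."""
    return f"warm-{recipe}.{WARM_BASE}"


def cold_label(recipe: str) -> str:
    """Throwaway cold label: first four chars of the recipe plus six hex digits."""
    return f"{recipe[:4]}-{secrets.token_hex(3)}"  # 3 random bytes = 6 hex chars
```

For example, `warm_domain("keycloak")` yields `warm-keycloak.ci.commoninternet.net`, while `cold_label("keycloak")` yields a label of the same shape as the `keyc-07d81e` app mentioned in the disk note.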