- Seed STATUS-2w / BACKLOG-2w / JOURNAL-2w (WC1-WC9 DoD, W0-W4 milestones).
- Tore down leftover Phase-2 cold apps (lasu-0a6fb2/keyc-07d81e/lasu-dbg); disk 91%->86%.
- DECISIONS: warm-domain scheme, per-run realm isolation, warm keycloak as declarative infra, cold fallback.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
JOURNAL — Phase 2w (warm canonical + --quick) — Builder
Append-only reasoning log (WHY). Facts/verification go in STATUS-2w; verdicts in REVIEW-2w.
2026-05-28 — Phase 2w bootstrap + cleanup + W0 design
Orientation. Operator interjected Phase 2w into Phase 2 (Phase 2 paused, state preserved).
Read the 2w plan + plan.md §6.1/§7/§9. Adversary already online (REVIEW-2w @2026-05-28 start),
idle awaiting a WC gate claim. Seeded STATUS-2w/BACKLOG-2w/JOURNAL-2w.
In-flight Phase 2 work committed. Working tree had an uncommitted edit to
tests/lasuite-drive/setup_custom_tests.sh (Q3.2 MinIO bucket creation via the createbuckets
one-shot) — the continuation of commit 6557197. Committed it (66e065d) with an honest message: not
yet live-verified (needs a lasuite-drive deploy once warm keycloak exists). This preserves Phase 2
progress at the pause point; it resumes after 2w DONE.
Cleanup (orchestrator-requested). cc-ci / was at 91% (only 2.4G free) — a real WC8 concern
before adding warm volumes/snapshots. Tore down the leftover COLD per-run apps from paused Phase 2
via lifecycle.teardown_app(..., verify=True): lasu-0a6fb2 (12-service lasuite-drive, heaviest),
keyc-07d81e (cold keycloak), lasu-dbg (debug lasuite). All TEARDOWN OK, no residual. Disk →
86% (3.8G free). Only infra stacks remain (backups, bridge, dashboard, drone, traefik). Did NOT
docker image prune — 9.7GB reclaimable but the image cache is the warm pull-cache; with authed
Docker Hub pulls now wired, a re-pull is billed to the account (cheaper) but still slow, so keep the
cache. Disk is the Phase-2w budget (WC8) — monitor.
W0 design (WC1 — live-warm keycloak). The existing SSO harness is already most of the way there:
- sso.setup_keycloak_realm(provider_domain, realm, client_id, ...) creates a realm+client+user idempotently via the admin API, and _kc_admin_password reads the admin password from inside the running container (docker exec ... cat /run/secrets/admin_password). So it works against ANY running keycloak, cold or warm, with no external password handling.
- The orchestrator dep flow (run_recipe_ci.py): declared_deps → deploy_deps (fresh co-deploy per run) → _enrich_deps_with_sso (creates realm, realm name currently = parent_recipe) → setup_custom_tests.sh hook → teardown_deps (undeploy).
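The in-container password lookup described above can be sketched roughly as follows. This is an illustration only, not the harness's actual _kc_admin_password: the function names, the container argument, and the injectable runner are assumptions. Only the docker exec ... cat /run/secrets/admin_password command comes from the notes above.

```python
import subprocess
from typing import Callable, List

SECRET_PATH = "/run/secrets/admin_password"  # path quoted in the journal


def admin_password_argv(container: str) -> List[str]:
    """Build the `docker exec` argv that prints the secret from inside the container."""
    return ["docker", "exec", container, "cat", SECRET_PATH]


def _run(argv: List[str]) -> str:
    # check=True raises CalledProcessError if the container or the secret is missing.
    done = subprocess.run(argv, check=True, capture_output=True, text=True)
    return done.stdout


def read_admin_password(container: str, run: Callable[[List[str]], str] = _run) -> str:
    """Hypothetical helper: read the keycloak admin password from a running container.

    `run` is injectable so the logic can be exercised without Docker.
    """
    return run(admin_password_argv(container)).strip()
```

Because the password is read from whichever container is running, the same call path would serve a cold per-run keycloak and a warm shared one; nothing needs to be stored outside the container.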
What WC1 changes:
- The realm becomes the per-run isolation unit on a shared live-warm keycloak. Realm name must be unique per (parent, pr, ref) so concurrent dependents don't collide: change from realm=parent_recipe to realm=<parent>-<6hex> (derive the hex from the parent's per-run domain label so it's stable within a run and distinct across concurrent runs).
- The keycloak dep is not co-deployed: point at the stable warm domain; on teardown, delete the realm (not undeploy keycloak). Fall back to cold co-deploy if no warm keycloak is present, so a from-scratch / no-warm environment still works; the warm keycloak is an optimization layer.
- The warm keycloak itself is declarative infra (Nix reconciler, like traefik) — NOT warm data (so it IS in the D8 closure as a reconciler; its realm data is ephemeral per-run anyway). Re-warmable from scratch.
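A minimal sketch of the realm naming and the warm/cold decision listed above, under stated assumptions: per_run_realm, KeycloakPlan, plan_keycloak_dep and the warm_available flag are hypothetical names, the label pattern is inferred from the <recipe[:4]>-<6hex> scheme, and reusing the per-run realm name in the cold fallback is a guess (the notes do not say which name the cold path keeps).

```python
import re
from dataclasses import dataclass

# Per-run label shape inferred from the cold scheme <recipe[:4]>-<6hex>.
_LABEL_RE = re.compile(r"^[a-z0-9]+-([0-9a-f]{6})$")


def per_run_realm(parent_recipe: str, parent_domain: str) -> str:
    """Return <parent>-<6hex>, taking the hex from the parent's per-run domain label."""
    label = parent_domain.split(".", 1)[0]  # first DNS label, e.g. "lasu-0a6fb2"
    match = _LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"not a per-run domain label: {label!r}")
    return f"{parent_recipe}-{match.group(1)}"


@dataclass(frozen=True)
class KeycloakPlan:
    mode: str          # "warm" (shared instance) or "cold" (co-deployed per run)
    realm: str
    on_teardown: str   # "delete_realm" or "undeploy"


def plan_keycloak_dep(parent_recipe: str, parent_domain: str,
                      warm_available: bool) -> KeycloakPlan:
    """Prefer the warm keycloak; fall back to a cold co-deploy when none exists."""
    realm = per_run_realm(parent_recipe, parent_domain)
    if warm_available:
        # Shared instance: only the realm belongs to this run, so only it is removed.
        return KeycloakPlan("warm", realm, "delete_realm")
    # No warm instance: the whole keycloak is private to the run and is undeployed.
    return KeycloakPlan("cold", realm, "undeploy")
```

With the label of one of the apps torn down earlier, per_run_realm("lasuite-drive", "lasu-0a6fb2") gives "lasuite-drive-0a6fb2". Deriving the hex from the label rather than generating a fresh one is what keeps the name stable across the steps of a single run.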
Stable-domain scheme decision: warm-<recipe>.ci.commoninternet.net (here warm-keycloak...),
clearly distinct from cold <recipe[:4]>-<6hex>. Risk: longer stack name → swarm 64-char
config/secret limit; will verify on first deploy and shorten if it overflows.
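The overflow risk can be estimated before the first deploy with a small check like the one below. It is a sketch resting on two assumptions that are not confirmed in these notes: that the stack name is the domain with dots replaced by underscores, and that each config/secret ends up named <stack>_<name>. The helper names and the _v1 version suffix in the usage note are hypothetical.

```python
from typing import List

SWARM_NAME_LIMIT = 64  # limit cited in the journal for config/secret names
BASE_DOMAIN = "ci.commoninternet.net"


def warm_domain(recipe: str) -> str:
    """Stable warm domain: warm-<recipe>.ci.commoninternet.net."""
    return f"warm-{recipe}.{BASE_DOMAIN}"


def stack_name(domain: str) -> str:
    # Assumption: the stack name is the domain with dots replaced by underscores.
    return domain.replace(".", "_")


def overflowing_names(domain: str, object_names: List[str]) -> List[str]:
    """Return the full config/secret names that would exceed the limit."""
    prefix = stack_name(domain)
    full = [f"{prefix}_{name}" for name in object_names]
    return [name for name in full if len(name) > SWARM_NAME_LIMIT]
```

For example, overflowing_names(warm_domain("keycloak"), ["admin_password_v1"]) returns an empty list under these assumptions, which suggests the keycloak case has headroom; recipes with longer names or longer secret names are the ones to check on first deploy.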
Building W0 in increments (each verified): (1) sso realm lifecycle prims + units; (2) deploy warm keycloak manually at the stable domain and prove realm create→delete via admin API; (3) wire the orchestrator live-warm mode; (4) declarative Nix reconciler; (5) e2e + concurrency + reaping proof.