status+journal(2w): W0 gate WC1+WC1.2+WC1.1(keycloak) ADVERSARY PASS @2026-05-29; advance to W1 (canonical registry); traefik W0.10 tracked before DONE

2026-05-29 02:10:55 +01:00
parent 31ac86d644
commit 56a95c68ef
2 changed files with 54 additions and 18 deletions


@@ -213,3 +213,23 @@ Claiming the WC1/WC1.1/WC1.2 gate.
Note: the reconciler WRITES alert sentinels to /var/lib/ci-warm/alerts/ (proven for rollback +
holds). The Builder-loop RELAY (sentinel → PushNotification + archive to seen/) runs each wake when an
alert is present; none currently. This delivery layer is loop behavior, not reconciler logic.
## 2026-05-29 — Gate WC1+WC1.2+WC1.1(keycloak) ADVERSARY PASS; advancing to W1
The Adversary cold-verified all 6 checks from its OWN clone (`cc-ci:/root/cc-ci-adv-verify`):
check1 unpinned/healthy/wired, check2 57 units, check3 headline lasuite-docs SSO e2e (install+custom
pass, deploy-count=1, per-run realm created+deleted, warm kc left `['master']`, cold teardown sacred),
check4 concurrency+reaping, check5 WC1.1 marquee rollback (data intact, last_good held, alert), check6
WC1.2 holds. **Gate verdict: PASS @2026-05-29** (REVIEW-2w 31ac86d) for exactly the claimed scope.
The Adversary independently hit + correctly attributed the same test-script cleanup footgun to the
test, not the reconciler. ONE tracked-open before DONE (no finding): traefik WC1.1 (W0.10) — its
stateless version-rollback isn't yet on the shared reconciler.
**Advancing to W1 (WC2 canonical registry + WC3 closure).** Design intent: a small declarative
registry of canonical recipes → known-good commit, each at `warm-<recipe>` kept DATA-warm (undeployed
when idle, volume retained), re-warmable. warmsnap (W0.5) already provides one-last-good snapshot +
restore. Need to decide: registry format/location (in-repo declarative) + the data-warm lifecycle
(deploy→use→undeploy-keep-volume) + how a canonical is seeded/advanced (WC5 cold-only, later). W1
builds the registry + data-warm reconcile; WC5/WC6 (promote-on-green-cold + nightly) come in W3.
traefik W0.10 + alert-relay deferred to a quiet window before DONE (traefik is critical TLS infra).
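
A minimal sketch of the data-warm lifecycle (deploy→use→undeploy-keep-volume) named above, in Python. The `FakeHost` class and the `data_warm` helper are hypothetical stand-ins for illustration; the real deploy tooling and its interface are not specified here, and only the `warm-<recipe>` naming comes from the journal.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class FakeHost:
    """Hypothetical stand-in for the real deploy tooling; it only records state."""
    deployed: set = field(default_factory=set)
    volumes: set = field(default_factory=set)

    def deploy(self, domain):
        self.deployed.add(domain)
        self.volumes.add(domain)  # first deploy creates the volume, later ones reuse it

    def undeploy(self, domain, keep_volume=True):
        self.deployed.discard(domain)
        if not keep_volume:
            self.volumes.discard(domain)


@contextmanager
def data_warm(host, recipe):
    """deploy -> use -> undeploy-keep-volume for the canonical at warm-<recipe>."""
    domain = f"warm-{recipe}"
    host.deploy(domain)
    try:
        yield domain
    finally:
        # Always return to the idle state: app down, data volume retained.
        host.undeploy(domain, keep_volume=True)
```

Wrapping the "use" step in a context manager means a failing test run still ends with the app undeployed and its volume kept, which is the idle state the design calls data-warm.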


@@ -12,15 +12,14 @@ canonical and upgrades to PR head (rolling back on failure), cold-only canonical
nightly full-cold sweep. Definition of Done = WC1-WC9 (plan §1), each Adversary cold-verified.
## Definition of Done (Phase 2w) — WC1-WC9 (+WC1.1/WC1.2), each Adversary cold-verified in REVIEW-2w
- [ ] **WC1** — Live-warm keycloak (SSO dep) at a stable domain, **UNPINNED** (fetch latest + chaos
deploy, like traefik; keep secret-generate-only-if-missing + health-wait); dependents
create+delete per-run namespaced realms; concurrent dependents don't collide; leftover realms reaped.
- [ ] **WC1.1** — Health-gated deploy-with-rollback in warm/infra reconcilers (traefik+keycloak):
record last-good → deploy latest → health-check → healthy commits last-good:=latest; unhealthy
rolls back + alerts. Stateful (keycloak): snapshot data volume before upgrade, restore on
rollback (reuse WC3 helper). traefik = version rollback only.
- [ ] **WC1.2** — Pre-deploy safety gate: auto-apply only non-major/no-manual-migration bumps; a
MAJOR bump or manual-migration release notes → stay on current + alert with notes (no silent auto-upgrade).
- [x] **WC1** — Live-warm UNPINNED keycloak; per-run namespaced realms (create+delete); concurrent
distinct realms; orphan realms reaped. **Adversary PASS @2026-05-29** (REVIEW-2w, gate 985686f).
- [~] **WC1.1** — Health-gated deploy-with-rollback. **keycloak (stateful) — Adversary PASS
@2026-05-29** (marquee: broken latest → snapshot→restore→prior, data intact, last_good held,
alert). **traefik (stateless, version-rollback-only) — NOT yet migrated = W0.10**, MUST close
before Phase-2w DONE (Adversary will require a cold proof).
- [x] **WC1.2** — Pre-deploy safety gate (major / manual-migration → hold + alert with notes, no
churn, short-circuits before WC1.1). **Adversary PASS @2026-05-29**.
- [ ] **WC2** — Data-warm canonical model: per-recipe canonical at a stable domain, declarative
registry tracking recipe→known-good commit; re-warmable from scratch.
- [ ] **WC3** — Known-good snapshots: raw volume copy taken while undeployed under stable path; one
@@ -37,7 +36,8 @@ nightly full-cold sweep. Definition of Done = WC1-WC9 (plan §1), each Adversa
confirm last-known-good restored intact; a `--quick` pass did not move the known-good).
## Milestones (plan §3)
- **W0** — Warm keycloak (WC1). ← IN FLIGHT
- **W0** — Warm keycloak (WC1/WC1.1-keycloak/WC1.2). ✅ Adversary PASS @2026-05-29.
- **W1** — Canonical registry + snapshot/restore (WC2, WC3). ← IN FLIGHT
- **W1** — Canonical registry + snapshot/restore (WC2, WC3).
- **W2** — `--quick` mode (WC4, WC7).
- **W3** — Cold-advances-canonical + nightly sweep (WC5, WC6).
@@ -82,13 +82,24 @@ nightly full-cold sweep. Definition of Done = WC1-WC9 (plan §1), each Adversa
`upgraded:` then `rolled-back:`, marker realm survives, `/var/lib/ci-warm/keycloak/last_good`
unchanged at the prior version, a `*rollback*.json` alert under `/var/lib/ci-warm/alerts/`.
**Next (remaining for WC1 gate):**
1. **W0.7** — fix the lasuite-docs in-place chaos-redeploy nginx-upstream race (`host not found in
upstream ...backend:8000`) OR pick a more-robust SSO dependent for the headline proof.
2. **W0.8** — headline WC1 e2e: dependent SSO custom test green vs warm keycloak + concurrent
distinct realms (no collision) + reaping. → claim WC1/WC1.1/WC1.2.
3. **Builder-loop alert relay** (deferred wiring) — on each wake, scan `/var/lib/ci-warm/alerts/*.json`,
PushNotification + record + archive to `alerts/seen/`; wire when nightly WC6 lands (first real alert).
**W0 COMPLETE — Adversary PASS @2026-05-29.** Now in **W1 (canonical registry, WC2/WC3)**.
**W1 plan (WC2 data-warm canonical model + WC3 closure):**
- WC2: a declarative **canonical registry** — which recipes are canonical + at which known-good
commit/version — with each canonical app at a **stable domain `warm-<recipe>`**, kept **data-warm**
(undeployed-when-idle, data volume retained). Re-warmable from scratch (cache). Reconciler/registry
declared in-repo.
- WC3: snapshots (warmsnap, W0.5 — done) tied to canonicals: one last-good per canonical under
`/var/lib/ci-warm/<recipe>/`, restore proven (done). Close WC3 with the canonical model.
- Distinguish from W0's live-warm keycloak: canonicals are DATA-warm (undeployed when idle), keycloak
is LIVE-warm (always up). Both use the `warm-<recipe>` stable scheme.
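
One possible shape for the declarative registry, sketched in Python under loud assumptions: the registry format is still undecided in this plan, so the JSON `{recipe: known-good}` layout, the `Canonical` class, and `load_registry` are all hypothetical. Only the `warm-<recipe>` scheme and the `/var/lib/ci-warm/<recipe>/` snapshot path are taken from the text above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Canonical:
    recipe: str
    known_good: str  # commit or version the canonical is held at

    @property
    def domain(self):
        return f"warm-{self.recipe}"

    @property
    def snapshot_dir(self):
        return f"/var/lib/ci-warm/{self.recipe}/"


def load_registry(text):
    """Parse a {recipe: known_good} JSON object into Canonical entries."""
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError("registry must be a JSON object of recipe -> known-good")
    entries = {}
    for recipe, known_good in raw.items():
        if not isinstance(known_good, str) or not known_good:
            raise ValueError(f"{recipe}: known-good must be a non-empty string")
        entries[recipe] = Canonical(recipe, known_good)
    return entries


EXAMPLE = '{"example-recipe": "0123abc"}'  # placeholder values, not real data
registry = load_registry(EXAMPLE)
```

Deriving the domain and snapshot path from the recipe name keeps the registry itself down to the one fact that changes over time: which commit is known-good.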
**Tracked before Phase-2w DONE (not blocking W1):**
- **W0.10a — traefik WC1.1** (Adversary requires a cold proof): migrate `proxy.nix` onto the shared
health-gated reconciler (stateless = version-rollback-only; preserve cert-secret/WILDCARDS_ENABLED/
COMPOSE_FILE setup). CAREFUL — traefik serves all TLS; deploy/test only in a quiet window.
- **W0.10b — Builder-loop alert relay**: each wake, scan `/var/lib/ci-warm/alerts/*.json` →
PushNotification → archive to `alerts/seen/`.
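
The relay step could look roughly like the following Python sketch. The `relay_alerts` function and its `notify` callback are hypothetical (the callback stands in for PushNotification, whose real interface is not shown here); the directory layout follows the `alerts/*.json` and `alerts/seen/` paths named above.

```python
import json
import shutil
from pathlib import Path


def relay_alerts(alerts_dir, notify):
    """Send each pending alert sentinel via notify(), then archive it to seen/."""
    alerts_dir = Path(alerts_dir)
    seen = alerts_dir / "seen"
    seen.mkdir(parents=True, exist_ok=True)
    relayed = []
    # glob("*.json") is non-recursive, so files already archived in seen/ are skipped.
    for sentinel in sorted(alerts_dir.glob("*.json")):
        try:
            payload = json.loads(sentinel.read_text())
        except json.JSONDecodeError:
            payload = {"raw": sentinel.read_text()}  # still relay a malformed sentinel
        notify(sentinel.name, payload)
        # Archive only after notify() returned, so a failed push is retried next wake.
        shutil.move(str(sentinel), str(seen / sentinel.name))
        relayed.append(sentinel.name)
    return relayed
```

Archiving after a successful notify gives at-least-once delivery: an exception in `notify` leaves the sentinel in place for the next wake.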
**Build finding (RESOLVED):** the W0.4 lasuite-docs `setup_custom_tests` redeploy failure (nginx web
`host not found in upstream ...backend:8000`) was **transient resource contention** from the
@@ -97,7 +108,12 @@ headline e2e is green (below). No recipe/harness change needed.
## Gate
### Gate: WC1 + WC1.1 + WC1.2 — CLAIMED, awaiting Adversary (@2026-05-29, HEAD = see `git log -1`)
### Gate: WC1 + WC1.2 + WC1.1(keycloak) — ✅ Adversary PASS @2026-05-29 (REVIEW-2w 31ac86d, gate 985686f)
All 6 checks cold-verified from the Adversary's own clone. Builder may proceed to W1. **Tracked open
(must close before Phase-2w DONE, not a blocker now): traefik WC1.1 (W0.10)** — stateless
version-rollback not yet on the shared health-gated reconciler; Adversary will require a cold proof.
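
For orientation, a stateless version-rollback of the kind still open for traefik might be sketched in Python as below. This is not the shared reconciler's code: `reconcile` and its `deploy`, `healthy`, and `alert` callbacks are hypothetical names, and the stateful snapshot/restore path used for keycloak is deliberately left out.

```python
def reconcile(current_good, latest, deploy, healthy, alert):
    """Deploy latest; keep it if healthy, else redeploy current_good and alert.

    Returns the version that should be recorded as last-good afterwards.
    """
    if latest == current_good:
        return current_good  # already on latest, nothing to do
    deploy(latest)
    if healthy():
        return latest  # healthy: last-good := latest
    deploy(current_good)  # stateless rollback: redeploy the prior version only
    alert({"kind": "rollback", "from": latest, "to": current_good})
    return current_good
```

Because last-good only advances on the healthy branch, a broken latest leaves the recorded version unchanged, matching the "last_good held" evidence the gate checks for.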
(claim detail retained below for the record)
**WHAT.** The live-warm keycloak layer (W0): a persistent **unpinned** keycloak at the stable domain
`warm-keycloak.ci.commoninternet.net`, declaratively reconciled, that SSO-dependent runs use via a