status+journal(2w): W0.9 WC1.1 live proofs PASS (healthy upgrade + marquee rollback); reconciler-side WC1/WC1.1/WC1.2 proven

2026-05-29 01:21:59 +01:00
parent 32f00717ac
commit 819c1bc0fd
2 changed files with 58 additions and 11 deletions


@@ -68,18 +68,27 @@ nightly full-cold sweep. Definition of Done = WC1-WC9 (plan §1), each Adversa
reconciler → noop-healthy (system 0-failed, 200); **WC1.2 holds proven** (MAJOR → held-major,
keycloak untouched; minor+manual-migration notes → held-manual-migration, alert carries notes).
**Next:**
1. **W0.9 — WC1.1 live proofs** (deploy cycles): (a) healthy upgrade — stage a fake newer tag
(re-tag of current → same healthy image) → reconcile snapshots + deploys + commits last-good;
(b) **rollback (marquee)** — stage a fake newer tag with a BROKEN compose (bad KC_HOSTNAME →
crash-loop) → reconcile snapshots → deploys broken → health-gate fails → restores snapshot +
redeploys prior → healthy + data intact (marker realm) + alert written + last_good NOT advanced.
2. **W0.7** — fix the lasuite-docs in-place-redeploy nginx-upstream race OR pick a more-robust SSO
dependent for the headline proof.
3. **W0.8** — headline WC1 e2e: dependent SSO custom test green vs warm keycloak + concurrent
- **W0.9 WC1.1 live proofs** DONE (32f0071). PROVEN on warm keycloak (annotated fake tags +
CCCI_SKIP_FETCH): (a) healthy upgrade 10.7.1→10.7.9 — snapshot+deploy+health-pass, last_good
committed, marker preserved; (b) **marquee rollback** — broken latest 10.7.10 → deploy fails →
rollback to 10.7.9, HEALTHY, marker realm INTACT (data preserved), last_good NOT advanced, rollback
alert written (attempted=10.7.10,last_good=10.7.9,recovered=True); recovered to canonical
10.7.1+26.6.2. Fixed 4 issues live (deploy-fail→rollback, warmsnap last_good subdir, wait_undeployed
swarm-settle, abra-stdout capture). 57 unit pass. **Reconciler-side WC1/WC1.1/WC1.2 proven.**
**Adversary reproduce (W0.9):** on cc-ci, with the keycloak recipe clone, create annotated fake
tags (peel `^{}`, set git identity) `10.7.9+26.6.2`(=good commit) and `10.7.10+26.6.2`(broken
KC_HOSTNAME), then `CCCI_SKIP_FETCH=1 cc-ci-run runner/warm_reconcile.py keycloak` twice; observe
`upgraded:` then `rolled-back:`, marker realm survives, `/var/lib/ci-warm/keycloak/last_good`
unchanged at the prior version, a `*rollback*.json` alert under `/var/lib/ci-warm/alerts/`.
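
The upgrade and rollback paths exercised above can be sketched as one decision function. This is a minimal illustration, not the code in `runner/warm_reconcile.py`: the `reconcile` signature, the `ReconcileResult` type, and the four injected callables are hypothetical, and the alert keys merely mirror the fields quoted above (`attempted`, `last_good`, `recovered`).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReconcileResult:
    outcome: str            # "upgraded" or "rolled-back"
    last_good: str          # version recorded as last-good after the run
    alert: Optional[dict]   # rollback alert payload, None on a healthy upgrade


def reconcile(
    last_good: str,
    candidate: str,
    snapshot: Callable[[], object],    # hypothetical: capture app data
    deploy: Callable[[str], bool],     # hypothetical: deploy a version, True on success
    healthy: Callable[[], bool],       # hypothetical: health gate
    restore: Callable[[object], None], # hypothetical: restore a snapshot
) -> ReconcileResult:
    # Always snapshot before touching the running app.
    snap = snapshot()

    # Healthy upgrade: deploy succeeds and the health gate passes,
    # so last_good advances to the candidate.
    if deploy(candidate) and healthy():
        return ReconcileResult("upgraded", candidate, None)

    # Rollback: a failed deploy and a failed health gate are treated alike.
    # Restore data, redeploy the prior version, and leave last_good unchanged.
    restore(snap)
    recovered = deploy(last_good) and healthy()
    alert = {"attempted": candidate, "last_good": last_good, "recovered": recovered}
    return ReconcileResult("rolled-back", last_good, alert)
```

Passing the side effects in as callables lets the same logic run against in-memory fakes in unit tests and against real deploy commands on the CI host.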
**Next (remaining for WC1 gate):**
1. **W0.7** — fix the lasuite-docs in-place chaos-redeploy nginx-upstream race (`host not found in
upstream ...backend:8000`) OR pick a more-robust SSO dependent for the headline proof.
2. **W0.8** — headline WC1 e2e: dependent SSO custom test green vs warm keycloak + concurrent
distinct realms (no collision) + reaping. → claim WC1/WC1.1/WC1.2.
4. **Builder-loop alert relay** — on each wake, scan `/var/lib/ci-warm/alerts/*.json`, PushNotification
+ record + archive to `alerts/seen/` (wire when first real alert can occur, i.e. with nightly WC6).
3. **Builder-loop alert relay** (deferred wiring) — on each wake, scan `/var/lib/ci-warm/alerts/*.json`,
PushNotification + record + archive to `alerts/seen/`; wire when nightly WC6 lands (first real alert).
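
A minimal sketch of the deferred alert relay, assuming each alert is a standalone JSON file directly under the alerts directory. The `relay_alerts` name and the `notify` callback are hypothetical stand-ins for the notification and record steps; only the `*.json` scan and the `seen/` archive directory come from the item above.

```python
import json
import shutil
from pathlib import Path
from typing import Callable, List


def relay_alerts(alerts_dir: Path, notify: Callable[[dict], None]) -> List[Path]:
    """Notify on every pending alert, then archive it under alerts_dir/seen/."""
    seen_dir = alerts_dir / "seen"
    seen_dir.mkdir(parents=True, exist_ok=True)

    archived = []
    # glob("*.json") is non-recursive, so files already in seen/ are skipped.
    for path in sorted(alerts_dir.glob("*.json")):
        alert = json.loads(path.read_text())
        notify(alert)  # assumption: the callback both pushes and records
        target = seen_dir / path.name
        shutil.move(str(path), str(target))
        archived.append(target)
    return archived
```

Archiving only after `notify` returns means an alert whose notification raises stays in place and is retried on the next wake.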
**Build finding (mine, to fix):** lasuite-docs `setup_custom_tests` in-place `abra app deploy
--force --chaos` (OIDC wiring) fails: nginx `web` fatally exits `[emerg] host not found in upstream