status+journal(2w): W1 gate WC2+WC3 ADVERSARY PASS @2026-05-29; advance to W2 (--quick mode)

2026-05-29 02:35:55 +01:00
parent 0246296370
commit 307269b5c6
2 changed files with 45 additions and 21 deletions


@@ -250,3 +250,20 @@ bounded). Did NOT enroll a 2nd recipe yet (custom-html suffices for W2 --quick +
Parked at the W1 gate. While awaiting: will do the non-disruptive W0.10b (alert-relay), NOT the traefik
W0.10a migration (it would disrupt the TLS that the Adversary needs to verify the data-warm round-trip).
## 2026-05-29 — W1 gate WC2+WC3 ADVERSARY PASS; advancing to W2 (--quick)
Adversary cold-verified WC2+WC3 from its own clone (REVIEW-2w 0246296): 61 units; its OWN data-warm
round-trip (deploy→write ADV marker→undeploy-keep-volume→redeploy→marker survived, Builder's known-good
also reattached); its OWN WC3 restore round-trip (mutate→restore→exact known-good content back,
mutation gone). Its 2 crashes were its own driver-script bugs, not product defects. Canonical left
clean. **WC2 + WC3 PASS @2026-05-29.** Same coordination lag as the W0 claim (its watchdog pinged on a
pre-claim read; resolved via ADVERSARY-INBOX). traefik WC1.1 (W0.10a) remains the sole tracked-open
item before DONE.
**Advancing to W2 (--quick, WC4+WC7).** Design: a `--quick` opt-in path in run_recipe_ci.py that
consumes the canonical (reattach → upgrade-to-PR-head → assert → PASS keep-volume / FAIL
restore-snapshot, NEVER promote), tagging results mode=quick, with a clean no-canonical fallback to
cold. Will study the existing upgrade-tier chaos-to-PR-head (HC1) mechanism, then add the quick flow +
units + a live proof on the custom-html canonical (the deliberately-fail-restores-known-good case is
also the WC9 rollback-proof preview).


@@ -23,10 +23,10 @@ nightly full-cold sweep. Definition of Done = WC1–WC9 (plan §1), each Adversa
- [x] **WC2** — Data-warm canonical model: per-recipe canonical at stable domain `warm-<recipe>`,
declarative registry (canonical.json + recipe_meta.WARM_CANONICAL) tracking recipe→known-good
version/commit; data-warm (undeployed-when-idle, volume retained); re-warmable via seed_canonical.
Proven on custom-html (W1.2). **CLAIMED — see Gate below.**
Proven on custom-html (W1.2). **Adversary PASS @2026-05-29** (REVIEW-2w 0246296, gate 4ce80f8).
- [x] **WC3** — Known-good snapshots: raw per-volume tar taken while undeployed under
`/var/lib/ci-warm/<recipe>/snapshot/`; one last-good per app, atomic subdir swap; restore
round-trips data (W0.5 mutate→restore proof + W1.2 data-warm reattach). **CLAIMED — see Gate.**
round-trips data (W0.5 + W1.2 + Adversary's own mutate→restore). **Adversary PASS @2026-05-29**.
- [ ] **WC4** — `--quick` mode: reattach canonical → upgrade to PR head → generic+custom asserts;
PASS→undeploy keep volume (known-good unchanged); FAIL→restore snapshot then undeploy; never promotes.
- [ ] **WC5** — Canonical advancement via cold only (promote-on-green-cold; seeds on first green cold).
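
The WC3 item above (raw per-volume tar, one last-good per app, atomic swap, restore round-trip) can be illustrated with a minimal sketch. This is NOT the project's `warmsnap` code: the function names, the `gen-*` generation directories, the `volume.tar` filename, and the choice of a symlink flip as the atomic swap are all assumptions made for illustration. The real mechanism may differ.

```python
import os
import shutil
import tarfile
import tempfile
from pathlib import Path


def take_snapshot(volume_dir: Path, recipe_root: Path) -> Path:
    """Tar volume_dir into a new generation dir, then atomically repoint 'snapshot'.

    Hypothetical layout: <recipe_root>/gen-XXXX/volume.tar, with
    <recipe_root>/snapshot being a relative symlink to the current generation.
    """
    recipe_root.mkdir(parents=True, exist_ok=True)
    gen_dir = Path(tempfile.mkdtemp(prefix="gen-", dir=recipe_root))
    with tarfile.open(gen_dir / "volume.tar", "w") as tar:
        for child in sorted(volume_dir.iterdir()):
            tar.add(child, arcname=child.name)

    link = recipe_root / "snapshot"
    previous = link.resolve() if link.is_symlink() else None

    # Build the new symlink under a temporary name, then rename it over the
    # old one. rename(2) replaces the destination atomically on POSIX.
    tmp_link = recipe_root / "snapshot.tmp"
    if tmp_link.is_symlink():
        tmp_link.unlink()
    os.symlink(gen_dir.name, tmp_link)
    os.replace(tmp_link, link)

    # Keep exactly one last-good: drop the generation we just superseded.
    if previous is not None and previous != gen_dir.resolve():
        shutil.rmtree(previous, ignore_errors=True)
    return gen_dir


def restore_snapshot(recipe_root: Path, volume_dir: Path) -> None:
    """Replace the volume contents with the last-good tar (app assumed undeployed)."""
    tar_path = recipe_root / "snapshot" / "volume.tar"
    shutil.rmtree(volume_dir, ignore_errors=True)
    volume_dir.mkdir(parents=True)
    with tarfile.open(tar_path) as tar:
        tar.extractall(volume_dir)
```

In this sketch a reader of `snapshot` always sees either the old or the new generation, never a half-written one, because the tar is fully written before the symlink is flipped.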
@@ -40,7 +40,8 @@ nightly full-cold sweep. Definition of Done = WC1–WC9 (plan §1), each Adversa
## Milestones (plan §3)
- **W0** — Warm keycloak (WC1/WC1.1-keycloak/WC1.2). ✅ Adversary PASS @2026-05-29.
- **W1** — Canonical registry + snapshot/restore (WC2, WC3). ← IN FLIGHT
- **W1** — Canonical registry + snapshot/restore (WC2, WC3). ✅ Adversary PASS @2026-05-29.
- **W2** — `--quick` mode (WC4, WC7). ← IN FLIGHT
- **W1** — Canonical registry + snapshot/restore (WC2, WC3).
- **W2** — `--quick` mode (WC4, WC7).
- **W3** — Cold-advances-canonical + nightly sweep (WC5, WC6).
@@ -87,25 +88,26 @@ nightly full-cold sweep. Definition of Done = WC1–WC9 (plan §1), each Adversa
**W0 COMPLETE — Adversary PASS @2026-05-29.** Now in **W1 (canonical registry, WC2/WC3)**.
**W1 progress:** W1.1 canonical registry module DONE (b6ef83a) — `runner/harness/canonical.py`
(enrollment via recipe_meta.WARM_CANONICAL, registry canonical.json, deploy/undeploy-keep-volume/
seed lifecycle) + 4 unit tests (61 unit pass). **Next: W1.2** — enroll custom-html
(`tests/custom-html/recipe_meta.py: WARM_CANONICAL=True`) + LIVE data-warm proof: seed a
warm-custom-html canonical with content → undeploy-keep-volume (verify volume retained, app down) →
deploy_canonical (reattach) → assert the written content survives; re-warmable from scratch. Then
close WC2/WC3.
**W0 ✅ + W1 ✅ Adversary PASS. Now in W2 (`--quick` mode, WC4 + WC7).**
**W1 plan (WC2 data-warm canonical model + WC3 closure):**
- WC2: a declarative **canonical registry** — which recipes are canonical + at which known-good
commit/version — with each canonical app at a **stable domain `warm-<recipe>`**, kept **data-warm**
(undeployed-when-idle, data volume retained). Re-warmable from scratch (cache). Reconciler/registry
declared in-repo.
- WC3: snapshots (warmsnap, W0.5 — done) tied to canonicals: one last-good per canonical under
`/var/lib/ci-warm/<recipe>/`, restore proven (done). Close WC3 with the canonical model.
- Distinguish from W0's live-warm keycloak: canonicals are DATA-warm (undeployed when idle), keycloak
is LIVE-warm (always up). Both use the `warm-<recipe>` stable scheme.
**W2 plan (`--quick` opt-in fast lane — plan §2 reference flow):**
- Add a `--quick` path to `runner/run_recipe_ci.py` (env `CCCI_QUICK=1` / `MODE=quick`): PRECOND a
canonical exists (canonical.has_canonical); else clean fallback (run COLD + report "no canonical").
1. `canonical.deploy_canonical(recipe)` — reattach the warm volume → fast warm boot at known-good.
2. wait_healthy.
3. (deps) point at warm keycloak + per-run realm (reuse the dep wiring).
4. **UPGRADE to PR head** (chaos redeploy of the canonical to the PR checkout) — the op, once.
5. assert: generic UPGRADE (reconverge + moved + serving) + recipe overlay + custom (requires_deps);
generic-first invariant holds.
6a. PASS → `canonical.undeploy_keep_volume` (known-good UNCHANGED — NEVER promote).
6b. FAIL → `warmsnap.restore` (last-known-good) → undeploy (roll back, data safe).
7. (deps) delete the per-run realm.
- WC7: default `!testme` = full cold (unchanged); `--quick` opt-in, NEVER gates merge; run results
carry the **mode** (cold|quick) so a quick pass is labelled lower-confidence; no-canonical fallback.
- Build: study run_recipe_ci.py upgrade tier + lifecycle chaos path; add unit tests + a live `--quick`
proof on the custom-html canonical (PASS keeps known-good; deliberately-fail restores it = the WC9
rollback proof preview).
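
The W2 reference flow above can be sketched as a small control-flow skeleton. This is a hedged illustration only, not the contents of `run_recipe_ci.py`: `QuickOps`, `run_quick`, and `recording_ops` are hypothetical names, the operations are injected as plain callables rather than bound to the real `canonical.*` / `warmsnap.*` functions, the dependency steps (3 and 7) are omitted, and treating an exception during upgrade as a FAIL is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Step = Callable[[str], None]


@dataclass
class QuickOps:
    """Hypothetical hooks; a real harness would bind these to its own modules."""
    has_canonical: Callable[[str], bool]
    run_cold: Callable[[str], bool]
    deploy_canonical: Step
    wait_healthy: Step
    upgrade_to_pr_head: Step
    run_asserts: Callable[[str], bool]
    restore_snapshot: Step
    undeploy_keep_volume: Step


def run_quick(recipe: str, ops: QuickOps) -> Dict[str, object]:
    # PRECOND: no canonical means a clean fallback to a cold run, labelled as such.
    if not ops.has_canonical(recipe):
        passed = ops.run_cold(recipe)
        return {"recipe": recipe, "mode": "cold", "passed": passed, "note": "no canonical"}

    ops.deploy_canonical(recipe)          # step 1: reattach the warm volume
    note = ""
    try:
        ops.wait_healthy(recipe)          # step 2
        ops.upgrade_to_pr_head(recipe)    # step 4: the op under test, once
        passed = ops.run_asserts(recipe)  # step 5
    except Exception as exc:              # assumption: any error counts as FAIL
        passed, note = False, f"error: {exc}"

    if not passed:
        ops.restore_snapshot(recipe)      # step 6b: roll back to last-known-good
    ops.undeploy_keep_volume(recipe)      # steps 6a/6b: never promote
    return {"recipe": recipe, "mode": "quick", "passed": passed, "note": note}


def recording_ops(calls: List[str], *, canonical: bool = True,
                  asserts_pass: bool = True) -> QuickOps:
    """Fake ops that only record the order in which they are called."""
    def step(name: str) -> Step:
        def _run(recipe: str) -> None:
            calls.append(name)
        return _run

    def check(name: str, result: bool) -> Callable[[str], bool]:
        def _run(recipe: str) -> bool:
            calls.append(name)
            return result
        return _run

    return QuickOps(
        has_canonical=lambda recipe: canonical,
        run_cold=check("run_cold", True),
        deploy_canonical=step("deploy_canonical"),
        wait_healthy=step("wait_healthy"),
        upgrade_to_pr_head=step("upgrade_to_pr_head"),
        run_asserts=check("run_asserts", asserts_pass),
        restore_snapshot=step("restore_snapshot"),
        undeploy_keep_volume=step("undeploy_keep_volume"),
    )
```

Calling `run_quick("custom-html", recording_ops(calls, asserts_pass=False))` exercises the deliberately-fail path: the recorded calls end with `restore_snapshot` followed by `undeploy_keep_volume`, and the result carries `mode="quick"` so it can be labelled lower-confidence than a cold pass.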
**Tracked before Phase-2w DONE (not blocking W1):**
**Tracked before Phase-2w DONE:**
- **W0.10a — traefik WC1.1** (Adversary requires a cold proof): migrate `proxy.nix` onto the shared
health-gated reconciler (stateless = version-rollback-only; preserve cert-secret/WILDCARDS_ENABLED/
COMPOSE_FILE setup). CAREFUL — traefik serves all TLS; deploy/test only in a quiet window.
@@ -119,7 +121,12 @@ headline e2e is green (below). No recipe/harness change needed.
## Gate
### Gate: WC2 + WC3 — CLAIMED, awaiting Adversary (@2026-05-29, HEAD = see `git log -1`)
### Gate: WC2 + WC3 — Adversary PASS @2026-05-29 (REVIEW-2w 0246296, gate 4ce80f8)
Cold-verified from the Adversary's own clone (its own data-warm round-trip + restore round-trip).
Builder may proceed to W2 (`--quick`). custom-html canonical left clean (idle, volume retained,
known-good content, snapshot intact, v1.11.0+1.29.0). (claim detail retained below.)
### (claimed, now PASS) Gate: WC2 + WC3 — CLAIMED detail
**WHAT.** The data-warm canonical model (W1): a declarative per-recipe canonical at the stable domain
`warm-<recipe>.ci.commoninternet.net`, kept **data-warm** (undeployed-when-idle, data volume