diff --git a/machine-docs/JOURNAL-2w.md b/machine-docs/JOURNAL-2w.md
index 5ffc17e..aadd5c1 100644
--- a/machine-docs/JOURNAL-2w.md
+++ b/machine-docs/JOURNAL-2w.md
@@ -250,3 +250,20 @@ bounded). Did NOT enroll a 2nd recipe yet (custom-html suffices for W2 --quick +
 
 Parked at the W1 gate. While awaiting: will do non-disruptive W0.10b (alert-relay) — NOT the traefik
 W0.10a migration (it disrupts TLS the Adversary needs to verify the data-warm round-trip through).
+
+## 2026-05-29 — W1 gate WC2+WC3 ADVERSARY PASS; advancing to W2 (--quick)
+
+Adversary cold-verified WC2+WC3 from its own clone (REVIEW-2w 0246296): 61 units; its OWN data-warm
+round-trip (deploy→write ADV marker→undeploy-keep-volume→redeploy→marker survived, Builder's known-good
+also reattached); its OWN WC3 restore round-trip (mutate→restore→exact known-good content back,
+mutation gone). Its 2 crashes were its own driver-script bugs, not product defects. Canonical left
+clean. **WC2 + WC3 PASS @2026-05-29.** Same coordination lag as the W0 claim (its watchdog pinged on a
+pre-claim read; resolved via ADVERSARY-INBOX). traefik WC1.1 (W0.10a) remains the sole tracked-open
+before DONE.
+
+**Advancing to W2 (--quick, WC4+WC7).** Design: a `--quick` opt-in path in run_recipe_ci.py that
+consumes the canonical (reattach → upgrade-to-PR-head → assert → PASS keep-volume / FAIL
+restore-snapshot, NEVER promote), tagging results mode=quick, with a clean no-canonical fallback to
+cold. Will study the existing upgrade-tier chaos-to-PR-head (HC1) mechanism, then add the quick flow +
+units + a live proof on the custom-html canonical (the deliberately-fail-restores-known-good case is
+also the WC9 rollback-proof preview).
diff --git a/machine-docs/STATUS-2w.md b/machine-docs/STATUS-2w.md
index 106f7f7..0f597c6 100644
--- a/machine-docs/STATUS-2w.md
+++ b/machine-docs/STATUS-2w.md
@@ -23,10 +23,10 @@ nightly full-cold sweep. Definition of Done = WC1–WC9 (plan §1), each Adversa
 - [x] **WC2** — Data-warm canonical model: per-recipe canonical at stable domain `warm-`,
   declarative registry (canonical.json + recipe_meta.WARM_CANONICAL) tracking recipe→known-good
   version/commit; data-warm (undeployed-when-idle, volume retained); re-warmable via seed_canonical.
-  Proven on custom-html (W1.2). **CLAIMED — see Gate below.**
+  Proven on custom-html (W1.2). **Adversary PASS @2026-05-29** (REVIEW-2w 0246296, gate 4ce80f8).
 - [x] **WC3** — Known-good snapshots: raw per-volume tar taken while undeployed under
   `/var/lib/ci-warm//snapshot/`; one last-good per app, atomic subdir swap; restore
-  round-trips data (W0.5 mutate→restore proof + W1.2 data-warm reattach). **CLAIMED — see Gate.**
+  round-trips data (W0.5 + W1.2 + Adversary's own mutate→restore). **Adversary PASS @2026-05-29**.
 - [ ] **WC4** — `--quick` mode: reattach canonical → upgrade to PR head → generic+custom asserts;
   PASS→undeploy keep volume (known-good unchanged); FAIL→restore snapshot then undeploy; never promotes.
 - [ ] **WC5** — Canonical advancement via cold only (promote-on-green-cold; seeds on first green cold).
@@ -40,7 +40,8 @@ nightly full-cold sweep. Definition of Done = WC1–WC9 (plan §1), each Adversa
 ## Milestones (plan §3)
 
 - **W0** — Warm keycloak (WC1/WC1.1-keycloak/WC1.2). ✅ Adversary PASS @2026-05-29.
-- **W1** — Canonical registry + snapshot/restore (WC2, WC3). ← IN FLIGHT
+- **W1** — Canonical registry + snapshot/restore (WC2, WC3). ✅ Adversary PASS @2026-05-29.
+- **W2** — `--quick` mode (WC4, WC7). ← IN FLIGHT
 - **W1** — Canonical registry + snapshot/restore (WC2, WC3).
 - **W2** — `--quick` mode (WC4, WC7).
 - **W3** — Cold-advances-canonical + nightly sweep (WC5, WC6).
@@ -87,25 +88,26 @@ nightly full-cold sweep. Definition of Done = WC1–WC9 (plan §1), each Adversa
 
 **W0 COMPLETE — Adversary PASS @2026-05-29.** Now in **W1 (canonical registry, WC2/WC3)**.
 
-**W1 progress:** W1.1 canonical registry module DONE (b6ef83a) — `runner/harness/canonical.py`
-(enrollment via recipe_meta.WARM_CANONICAL, registry canonical.json, deploy/undeploy-keep-volume/
-seed lifecycle) + 4 unit tests (61 unit pass). **Next: W1.2** — enroll custom-html
-(`tests/custom-html/recipe_meta.py: WARM_CANONICAL=True`) + LIVE data-warm proof: seed a
-warm-custom-html canonical with content → undeploy-keep-volume (verify volume retained, app down) →
-deploy_canonical (reattach) → assert the written content survives; re-warmable from scratch. Then
-close WC2/WC3.
+**W0 ✅ + W1 ✅ Adversary PASS. Now in W2 (`--quick` mode, WC4 + WC7).**
 
-**W1 plan (WC2 data-warm canonical model + WC3 closure):**
-- WC2: a declarative **canonical registry** — which recipes are canonical + at which known-good
-  commit/version — with each canonical app at a **stable domain `warm-`**, kept **data-warm**
-  (undeployed-when-idle, data volume retained). Re-warmable from scratch (cache). Reconciler/registry
-  declared in-repo.
-- WC3: snapshots (warmsnap, W0.5 — done) tied to canonicals: one last-good per canonical under
-  `/var/lib/ci-warm//`, restore proven (done). Close WC3 with the canonical model.
-- Distinguish from W0's live-warm keycloak: canonicals are DATA-warm (undeployed when idle), keycloak
-  is LIVE-warm (always up). Both use the `warm-` stable scheme.
+**W2 plan (`--quick` opt-in fast lane — plan §2 reference flow):**
+- Add a `--quick` path to `runner/run_recipe_ci.py` (env `CCCI_QUICK=1` / `MODE=quick`): PRECOND a
+  canonical exists (canonical.has_canonical); else clean fallback (run COLD + report "no canonical").
+  1. `canonical.deploy_canonical(recipe)` — reattach the warm volume → fast warm boot at known-good.
+  2. wait_healthy. 3. (deps) point at warm keycloak + per-run realm (reuse the dep wiring).
+  4. **UPGRADE to PR head** (chaos redeploy of the canonical to the PR checkout) — the op, once.
+  5. assert: generic UPGRADE (reconverge + moved + serving) + recipe overlay + custom (requires_deps);
+     generic-first invariant holds.
+  6a. PASS → `canonical.undeploy_keep_volume` (known-good UNCHANGED — NEVER promote).
+  6b. FAIL → `warmsnap.restore` (last-known-good) → undeploy (roll back, data safe).
+  7. (deps) delete the per-run realm.
+- WC7: default `!testme` = full cold (unchanged); `--quick` opt-in, NEVER gates merge; run results
+  carry the **mode** (cold|quick) so a quick pass is labelled lower-confidence; no-canonical fallback.
+- Build: study run_recipe_ci.py upgrade tier + lifecycle chaos path; add unit tests + a live `--quick`
+  proof on the custom-html canonical (PASS keeps known-good; deliberately-fail restores it = the WC9
+  rollback proof preview).
 
-**Tracked before Phase-2w DONE (not blocking W1):**
+**Tracked before Phase-2w DONE:**
 - **W0.10a — traefik WC1.1** (Adversary requires a cold proof): migrate `proxy.nix` onto the shared
   health-gated reconciler (stateless = version-rollback-only; preserve cert-secret/WILDCARDS_ENABLED/
   COMPOSE_FILE setup). CAREFUL — traefik serves all TLS; deploy/test only in a quiet window.
@@ -119,7 +121,12 @@ headline e2e is green (below). No recipe/harness change needed.
 
 ## Gate
 
-### Gate: WC2 + WC3 — CLAIMED, awaiting Adversary (@2026-05-29, HEAD = see `git log -1`)
+### Gate: WC2 + WC3 — ✅ Adversary PASS @2026-05-29 (REVIEW-2w 0246296, gate 4ce80f8)
+Cold-verified from the Adversary's own clone (its own data-warm round-trip + restore round-trip).
+Builder may proceed to W2 (`--quick`). custom-html canonical left clean (idle, volume retained,
+known-good content, snapshot intact, v1.11.0+1.29.0). (claim detail retained below.)
+
+### (claimed, now PASS) Gate: WC2 + WC3 — CLAIMED detail
 **WHAT.** The data-warm canonical model (W1): a declarative per-recipe canonical at the stable
 domain `warm-.ci.commoninternet.net`, kept **data-warm** (undeployed-when-idle, data volume