diff --git a/machine-docs/JOURNAL-2w.md b/machine-docs/JOURNAL-2w.md
index 71308a0..bc72bd5 100644
--- a/machine-docs/JOURNAL-2w.md
+++ b/machine-docs/JOURNAL-2w.md
@@ -291,3 +291,20 @@ doing so. Claiming WC4+WC7 now with that prefix. System clean post-rebuild: keycloak 200, custom-html canonical idle@1.11.0+1.29.0, 0 failed units, disk 50%. Parked at the W2 gate; next quiet-window work = W0.10a traefik WC1.1 migration.
+
+## 2026-05-29 — W2 gate WC4+WC7 ADVERSARY PASS; advancing to W3 (+ traefik quiet window)
+
+Adversary cold-verified WC4+WC7 (REVIEW-2w 31f0e42): 64 units; WC7 adversarial trigger battery
+(all negatives rejected on the live bridge); WC4 never-promote (snapshot byte-identical sha256
+9ef62bdf, registry unchanged); WC4 FAIL→rollback restored EXACT known-good (marker back, app 200,
+broken image gone, exit 1 — "WC9 rollback-proof in miniature"); no-canonical fallback to a cold
+per-run domain (canonical untouched). No tests softened. **WC4+WC7 PASS @2026-05-29.**
+
+Three of four milestones now PASS (W0, W1, W2). Advancing to W3 (WC5 promote-on-green-cold + WC6
+nightly sweep). ALSO: the Adversary is now idle (post-W2), so this is the QUIET WINDOW for the
+tracked W0.10a traefik WC1.1 migration (it disrupts TLS, so it must NOT overlap an Adversary verify).
+
+Plan for next: (a) W0.10a traefik health-gated reconciler migration (quiet window, careful — traefik
+serves all TLS); (b) W3 WC5 promote-on-green-cold (extend cold-run teardown to re-seed the canonical
+on green-latest, reusing seed_canonical); (c) W3 WC6 nightly sweep (systemd timer: rebuild-then-cold-
+sweep). traefik first (use the window) or interleave; W0.10b alert-relay is a small loop step.
diff --git a/machine-docs/STATUS-2w.md b/machine-docs/STATUS-2w.md
index 9b70738..8a970db 100644
--- a/machine-docs/STATUS-2w.md
+++ b/machine-docs/STATUS-2w.md
@@ -30,13 +30,13 @@ nightly full-cold sweep.
Definition of Done = WC1–WC9 (plan §1), each Adversa
 - [x] **WC4** — `--quick` mode (`run_quick` in run_recipe_ci.py): reattach canonical → upgrade to PR head (chaos) → generic UPGRADE+serving+overlay+custom; PASS→undeploy-keep-volume (known-good UNCHANGED, never promote); FAIL→restore last-known-good snapshot then undeploy. Proven live on
- custom-html (PASS + FAIL). **CLAIMED — see Gate.**
+ custom-html (PASS + FAIL). **Adversary PASS @2026-05-29** (REVIEW-2w 31f0e42, gate 3ff2bf6).
 - [ ] **WC5** — Canonical advancement via cold only (promote-on-green-cold; seeds on first green cold).
 - [ ] **WC6** — Nightly full-cold sweep (scheduled, declarative, MAX_TESTS-bounded).
 - [x] **WC7** — Trigger/authority/labeling: default `!testme`=cold (unchanged); `--quick` opt-in via bridge `parse_trigger` (`!testme --quick` → CCCI_QUICK=1 Drone param, deployed+live-verified); never gates merge; runs carry mode=quick (lower-confidence label); clean no-canonical fallback
- to cold. **CLAIMED — see Gate.**
+ to cold. **Adversary PASS @2026-05-29** (REVIEW-2w 31f0e42, gate 3ff2bf6).
 - [ ] **WC8** — Resource safety + isolation: warm runs serialize per app; warm keycloak shared via per-run realms; disk monitored+pruned; cold teardown sacred; warm data excluded from D8 closure.
 - [ ] **WC9** — Docs + cold verify incl. the rollback proof (deliberately fail a PR under `--quick`,
@@ -45,8 +45,8 @@ nightly full-cold sweep. Definition of Done = WC1–WC9 (plan §1), each Adversa
 ## Milestones (plan §3)
 - **W0** — Warm keycloak (WC1/WC1.1-keycloak/WC1.2). ✅ Adversary PASS @2026-05-29.
 - **W1** — Canonical registry + snapshot/restore (WC2, WC3). ✅ Adversary PASS @2026-05-29.
-- **W2** — `--quick` mode (WC4, WC7). ← CLAIMED, awaiting Adversary
-- **W3** — Cold-advances-canonical + nightly sweep (WC5, WC6).
+- **W2** — `--quick` mode (WC4, WC7). ✅ Adversary PASS @2026-05-29.
+- **W3** — Cold-advances-canonical + nightly sweep (WC5, WC6).
← IN FLIGHT
 - **W1** — Canonical registry + snapshot/restore (WC2, WC3).
 - **W2** — `--quick` mode (WC4, WC7).
 - **W3** — Cold-advances-canonical + nightly sweep (WC5, WC6).
@@ -93,24 +93,23 @@ nightly full-cold sweep. Definition of Done = WC1–WC9 (plan §1), each Adversa
 **W0 COMPLETE — Adversary PASS @2026-05-29.** Now in **W1 (canonical registry, WC2/WC3)**.
-**W0 ✅ + W1 ✅ Adversary PASS. Now in W2 (`--quick` mode, WC4 + WC7).**
+**W0 ✅ + W1 ✅ + W2 ✅ Adversary PASS. Now in W3 (cold-advances-canonical WC5 + nightly sweep WC6).**
-**W2 plan (`--quick` opt-in fast lane — plan §2 reference flow):**
-- Add a `--quick` path to `runner/run_recipe_ci.py` (env `CCCI_QUICK=1` / `MODE=quick`): PRECOND a
- canonical exists (canonical.has_canonical); else clean fallback (run COLD + report "no canonical").
- 1. `canonical.deploy_canonical(recipe)` — reattach the warm volume → fast warm boot at known-good.
- 2. wait_healthy. 3. (deps) point at warm keycloak + per-run realm (reuse the dep wiring).
- 4. **UPGRADE to PR head** (chaos redeploy of the canonical to the PR checkout) — the op, once.
- 5. assert: generic UPGRADE (reconverge + moved + serving) + recipe overlay + custom (requires_deps);
- generic-first invariant holds.
- 6a. PASS → `canonical.undeploy_keep_volume` (known-good UNCHANGED — NEVER promote).
- 6b. FAIL → `warmsnap.restore` (last-known-good) → undeploy (roll back, data safe).
- 7. (deps) delete the per-run realm.
-- WC7: default `!testme` = full cold (unchanged); `--quick` opt-in, NEVER gates merge; run results
- carry the **mode** (cold|quick) so a quick pass is labelled lower-confidence; no-canonical fallback.
-- Build: study run_recipe_ci.py upgrade tier + lifecycle chaos path; add unit tests + a live `--quick`
- proof on the custom-html canonical (PASS keeps known-good; deliberately-fail restores it = the WC9
- rollback proof preview).
+**W3 plan:**
+- **WC5 — promote-on-green-cold.** A GREEN full-cold run on the LATEST (not a `--quick` run) of an
+ enrolled (WARM_CANONICAL) recipe re-snapshots + re-tags the canonical known-good instead of
+ deleting the volume at teardown: at the end of a green cold run, undeploy → `canonical.seed_canonical`
+ (snapshot while undeployed + write registry version=the green commit/version) → keep the volume as
+ the new canonical. The FIRST green cold run on latest SEEDS the canonical. ONLY cold advances it
+ (`--quick` never promotes — proven W2). Wire into run_recipe_ci.py cold teardown, gated on:
+ recipe is WARM_CANONICAL + run was green + deployed LATEST (not a pinned/prev base). Add unit
+ tests + a live proof (green cold custom-html run → canonical re-seeded at the new known-good).
+- **WC6 — nightly full-cold sweep.** Declarative scheduler (systemd timer on cc-ci): nightly does
+ `nixos-rebuild switch` FIRST (rolls warm/infra to latest, health-gated per WC1.1) THEN a full-cold
+ sweep across enrolled recipes (serial, MAX_TESTS-bounded), refreshing each canonical's known-good
+ (WC5) + serving as the daily authoritative regression. MUST NOT run while a test is in flight.
+- **Quiet-window opportunity (now): W0.10a traefik WC1.1** — Adversary idle post-W2 PASS, so this is
+ the window to migrate traefik onto the health-gated reconciler (tracked-before-DONE; below).
 **Tracked before Phase-2w DONE:**
 - **W0.10a — traefik WC1.1** (Adversary requires a cold proof): migrate `proxy.nix` onto the shared
@@ -126,7 +125,13 @@ headline e2e is green (below). No recipe/harness change needed.
 ## Gate
-### Gate: WC4 + WC7 — CLAIMED, awaiting Adversary (@2026-05-29, HEAD = see `git log -1`)
+### Gate: WC4 + WC7 — ✅ Adversary PASS @2026-05-29 (REVIEW-2w 31f0e42, gate 3ff2bf6)
+Cold-verified from the Adversary's own clone: 64 units; WC7 adversarial trigger battery (all negatives
+rejected, live bridge); WC4 never-promote (snapshot byte-identical, registry unchanged); WC4
+FAIL→rollback restored EXACT known-good (marker back, 200, broken image gone, exit 1); no-canonical
+fallback to a cold per-run domain. Builder may proceed to W3. (claim detail retained below.)
+
+### (claimed, now PASS) Gate: WC4 + WC7 — CLAIMED detail
 **WHAT.** The `--quick` opt-in fast lane (W2): reattach the data-warm canonical → upgrade in place to the PR head → assert (generic upgrade reconverge+moved+serving + overlay + custom); PASS →