status+journal(2w): W2 gate WC4+WC7 ADVERSARY PASS @2026-05-29; advance to W3 (WC5/WC6) + traefik W0.10a quiet window

2026-05-29 03:34:29 +01:00
parent 31f0e426c4
commit aec6911c68
2 changed files with 44 additions and 22 deletions

@@ -291,3 +291,20 @@ doing so. Claiming WC4+WC7 now with that prefix.
System clean post-rebuild: keycloak 200, custom-html canonical idle@1.11.0+1.29.0, 0 failed units,
disk 50%. Parked at the W2 gate; next quiet-window work = W0.10a traefik WC1.1 migration.
## 2026-05-29 — W2 gate WC4+WC7 ADVERSARY PASS; advancing to W3 (+ traefik quiet window)
Adversary cold-verified WC4+WC7 (REVIEW-2w 31f0e42): 64 units; WC7 adversarial trigger battery
(all negatives rejected on the live bridge); WC4 never-promote (snapshot byte-identical sha256
9ef62bdf, registry unchanged); WC4 FAIL→rollback restored EXACT known-good (marker back, app 200,
broken image gone, exit 1 — "WC9 rollback-proof in miniature"); no-canonical fallback to a cold
per-run domain (canonical untouched). No tests softened. **WC4+WC7 PASS @2026-05-29.**
Three of four milestones now PASS (W0, W1, W2). Advancing to W3 (WC5 promote-on-green-cold + WC6
nightly sweep). ALSO: the Adversary is now idle (post-W2), so this is the QUIET WINDOW for the
tracked W0.10a traefik WC1.1 migration (it disrupts TLS, so it must NOT overlap an Adversary verify).
Plan for next: (a) W0.10a traefik health-gated reconciler migration (quiet window, careful — traefik
serves all TLS); (b) W3 WC5 promote-on-green-cold (extend cold-run teardown to re-seed the canonical
on green-latest, reusing seed_canonical); (c) W3 WC6 nightly sweep (systemd timer: rebuild-then-cold-
sweep). traefik first (use the window) or interleave; W0.10b alert-relay is a small loop step.
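
The never-promote evidence above (snapshot byte-identical by sha256, registry unchanged) can be checked mechanically. A minimal sketch, assuming the snapshot and the registry are each a single file on disk; `sha256_of`, `assert_never_promoted` and the paths passed in are hypothetical names for illustration, not helpers from the repo:

```python
import hashlib
from pathlib import Path
from typing import Callable


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of a file, read in chunks to bound memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def assert_never_promoted(snapshot: Path, registry: Path, run: Callable[[], None]) -> None:
    """Call run() and fail if the snapshot or registry bytes changed.

    Assumption: both artifacts are plain files; a directory-based snapshot
    would need a different comparison.
    """
    before = (sha256_of(snapshot), sha256_of(registry))
    run()
    after = (sha256_of(snapshot), sha256_of(registry))
    if before != after:
        raise AssertionError("quick run modified the known-good snapshot or registry")
```

Hashing both files before and after the run keeps the check independent of what the run itself reports.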

@@ -30,13 +30,13 @@ nightly full-cold sweep. Definition of Done = WC1–WC9 (plan §1), each Adversa
- [x] **WC4** — `--quick` mode (`run_quick` in run_recipe_ci.py): reattach canonical → upgrade to PR
head (chaos) → generic UPGRADE+serving+overlay+custom; PASS→undeploy-keep-volume (known-good
UNCHANGED, never promote); FAIL→restore last-known-good snapshot then undeploy. Proven live on
custom-html (PASS + FAIL). **CLAIMED — see Gate.**
custom-html (PASS + FAIL). **Adversary PASS @2026-05-29** (REVIEW-2w 31f0e42, gate 3ff2bf6).
- [ ] **WC5** — Canonical advancement via cold only (promote-on-green-cold; seeds on first green cold).
- [ ] **WC6** — Nightly full-cold sweep (scheduled, declarative, MAX_TESTS-bounded).
- [x] **WC7** — Trigger/authority/labeling: default `!testme`=cold (unchanged); `--quick` opt-in via
bridge `parse_trigger` (`!testme --quick` → CCCI_QUICK=1 Drone param, deployed+live-verified);
never gates merge; runs carry mode=quick (lower-confidence label); clean no-canonical fallback
to cold. **CLAIMED — see Gate.**
to cold. **Adversary PASS @2026-05-29** (REVIEW-2w 31f0e42, gate 3ff2bf6).
- [ ] **WC8** — Resource safety + isolation: warm runs serialize per app; warm keycloak shared via
per-run realms; disk monitored+pruned; cold teardown sacred; warm data excluded from D8 closure.
- [ ] **WC9** — Docs + cold verify incl. the rollback proof (deliberately fail a PR under `--quick`,
@@ -45,8 +45,8 @@ nightly full-cold sweep. Definition of Done = WC1–WC9 (plan §1), each Adversa
## Milestones (plan §3)
- **W0** — Warm keycloak (WC1/WC1.1-keycloak/WC1.2). ✅ Adversary PASS @2026-05-29.
- **W1** — Canonical registry + snapshot/restore (WC2, WC3). ✅ Adversary PASS @2026-05-29.
- **W2** — `--quick` mode (WC4, WC7). ← CLAIMED, awaiting Adversary
- **W3** — Cold-advances-canonical + nightly sweep (WC5, WC6).
- **W2** — `--quick` mode (WC4, WC7). ✅ Adversary PASS @2026-05-29.
- **W3** — Cold-advances-canonical + nightly sweep (WC5, WC6). ← IN FLIGHT
- **W1** — Canonical registry + snapshot/restore (WC2, WC3).
- **W2** — `--quick` mode (WC4, WC7).
- **W3** — Cold-advances-canonical + nightly sweep (WC5, WC6).
@@ -93,24 +93,23 @@ nightly full-cold sweep. Definition of Done = WC1–WC9 (plan §1), each Adversa
**W0 COMPLETE — Adversary PASS @2026-05-29.** Now in **W1 (canonical registry, WC2/WC3)**.
**W0 ✅ + W1 ✅ Adversary PASS. Now in W2 (`--quick` mode, WC4 + WC7).**
**W0 ✅ + W1 ✅ + W2 ✅ Adversary PASS. Now in W3 (cold-advances-canonical WC5 + nightly sweep WC6).**
**W2 plan (`--quick` opt-in fast lane — plan §2 reference flow):**
- Add a `--quick` path to `runner/run_recipe_ci.py` (env `CCCI_QUICK=1` / `MODE=quick`): PRECOND a
canonical exists (canonical.has_canonical); else clean fallback (run COLD + report "no canonical").
1. `canonical.deploy_canonical(recipe)` — reattach the warm volume → fast warm boot at known-good.
2. wait_healthy. 3. (deps) point at warm keycloak + per-run realm (reuse the dep wiring).
4. **UPGRADE to PR head** (chaos redeploy of the canonical to the PR checkout) — the op, once.
5. assert: generic UPGRADE (reconverge + moved + serving) + recipe overlay + custom (requires_deps);
generic-first invariant holds.
6a. PASS → `canonical.undeploy_keep_volume` (known-good UNCHANGED — NEVER promote).
6b. FAIL → `warmsnap.restore` (last-known-good) → undeploy (roll back, data safe).
7. (deps) delete the per-run realm.
- WC7: default `!testme` = full cold (unchanged); `--quick` opt-in, NEVER gates merge; run results
carry the **mode** (cold|quick) so a quick pass is labelled lower-confidence; no-canonical fallback.
- Build: study run_recipe_ci.py upgrade tier + lifecycle chaos path; add unit tests + a live `--quick`
proof on the custom-html canonical (PASS keeps known-good; deliberately-fail restores it = the WC9
rollback proof preview).
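
The W2 flow above, sketched as control flow only. This is not the actual `run_quick` from run_recipe_ci.py: `QuickOps` and `run_quick_sketch` are hypothetical, the hooks stand in for the helpers the plan names, the dependency steps (3 and 7) are omitted, and treating an exception during upgrade or assertion as a FAIL is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class QuickOps:
    """Hypothetical hooks; the real runner would wire these to its own helpers."""
    has_canonical: Callable[[str], bool]
    deploy_canonical: Callable[[str], None]
    wait_healthy: Callable[[str], None]
    upgrade_to_pr_head: Callable[[str], None]
    run_assertions: Callable[[str], bool]
    undeploy_keep_volume: Callable[[str], None]
    restore_snapshot: Callable[[str], None]
    undeploy: Callable[[str], None]
    run_cold: Callable[[str], bool]


def run_quick_sketch(recipe: str, ops: QuickOps) -> dict:
    """Run one quick-lane attempt and report the mode that actually ran."""
    if not ops.has_canonical(recipe):
        # Precondition failed: clean fallback to a full cold run.
        return {"mode": "cold", "passed": ops.run_cold(recipe), "note": "no canonical"}

    ops.deploy_canonical(recipe)   # step 1: reattach the warm volume
    ops.wait_healthy(recipe)       # step 2
    try:
        ops.upgrade_to_pr_head(recipe)        # step 4: the op, once
        passed = ops.run_assertions(recipe)   # step 5
    except Exception:
        passed = False             # assumption: any error counts as FAIL

    if passed:
        ops.undeploy_keep_volume(recipe)   # 6a: known-good unchanged, never promote
    else:
        ops.restore_snapshot(recipe)       # 6b: roll back to last-known-good
        ops.undeploy(recipe)
    return {"mode": "quick", "passed": passed, "note": ""}
```

Passing the operations in as callables keeps the ordering testable without a live deployment; neither branch ever writes a new known-good.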
**W3 plan:**
- **WC5 — promote-on-green-cold.** A GREEN full-cold run on the LATEST (not a `--quick` run) of an
enrolled (WARM_CANONICAL) recipe re-snapshots + re-tags the canonical known-good instead of
deleting the volume at teardown: at the end of a green cold run, undeploy → `canonical.seed_canonical`
(snapshot while undeployed + write registry version=the green commit/version) → keep the volume as
the new canonical. The FIRST green cold run on latest SEEDS the canonical. ONLY cold advances it
(`--quick` never promotes — proven W2). Wire into run_recipe_ci.py cold teardown, gated on:
recipe is WARM_CANONICAL + run was green + deployed LATEST (not a pinned/prev base). Add unit
tests + a live proof (green cold custom-html run → canonical re-seeded at the new known-good).
- **WC6 — nightly full-cold sweep.** Declarative scheduler (systemd timer on cc-ci): nightly does
`nixos-rebuild switch` FIRST (rolls warm/infra to latest, health-gated per WC1.1) THEN a full-cold
sweep across enrolled recipes (serial, MAX_TESTS-bounded), refreshing each canonical's known-good
(WC5) + serving as the daily authoritative regression. MUST NOT run while a test is in flight.
- **Quiet-window opportunity (now): W0.10a traefik WC1.1** — Adversary idle post-W2 PASS, so this is
the window to migrate traefik onto the health-gated reconciler (tracked-before-DONE; below).
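
A rough sketch of the WC5 promote gate and the WC6 sweep ordering described in the W3 plan. All names here (`should_promote`, `nightly_sweep` and their parameters) are hypothetical. The plan does not define MAX_TESTS, so reading it as a cap on the number of recipes per sweep is an assumption, as is returning `None` when the sweep is skipped.

```python
from typing import Callable, Dict, Iterable, Optional


def should_promote(mode: str, warm_canonical: bool, green: bool, deployed_latest: bool) -> bool:
    """WC5 gate: only a green full-cold run on latest of an enrolled recipe promotes."""
    return mode == "cold" and warm_canonical and green and deployed_latest


def nightly_sweep(
    recipes: Iterable[str],
    max_tests: int,
    test_in_flight: Callable[[], bool],
    rebuild: Callable[[], None],
    run_cold: Callable[[str], bool],
) -> Optional[Dict[str, bool]]:
    """WC6 ordering: skip if busy, rebuild first, then a serial bounded cold sweep."""
    if test_in_flight():
        return None  # must not run while a test is in flight
    rebuild()        # stands in for `nixos-rebuild switch`; expected to raise on failure
    results: Dict[str, bool] = {}
    # Assumption: MAX_TESTS caps how many recipes one sweep covers.
    for recipe in list(recipes)[:max_tests]:
        results[recipe] = run_cold(recipe)  # serial, one recipe at a time
    return results
```

Keeping the gate a pure function of four facts makes the "`--quick` never promotes" rule a one-line unit test.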
**Tracked before Phase-2w DONE:**
- **W0.10a — traefik WC1.1** (Adversary requires a cold proof): migrate `proxy.nix` onto the shared
@@ -126,7 +125,13 @@ headline e2e is green (below). No recipe/harness change needed.
## Gate
### Gate: WC4 + WC7 — CLAIMED, awaiting Adversary (@2026-05-29, HEAD = see `git log -1`)
### Gate: WC4 + WC7 — Adversary PASS @2026-05-29 (REVIEW-2w 31f0e42, gate 3ff2bf6)
Cold-verified from the Adversary's own clone: 64 units; WC7 adversarial trigger battery (all negatives
rejected, live bridge); WC4 never-promote (snapshot byte-identical, registry unchanged); WC4
FAIL→rollback restored EXACT known-good (marker back, 200, broken image gone, exit 1); no-canonical
fallback to a cold per-run domain. Builder may proceed to W3. (claim detail retained below.)
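
For the WC7 trigger battery, a strict parser of roughly this shape would reject malformed negatives. This is a sketch, not the bridge's real `parse_trigger`: `parse_trigger_sketch` is a hypothetical name, and requiring the whole comment to match exactly is an assumption about how strict the live bridge is.

```python
from typing import Dict, Optional


def parse_trigger_sketch(comment: str) -> Optional[Dict[str, str]]:
    """Return Drone-style params for an exact trigger, or None to reject."""
    tokens = comment.strip().split()
    if tokens == ["!testme"]:
        return {}                    # default: full cold, no extra params
    if tokens == ["!testme", "--quick"]:
        return {"CCCI_QUICK": "1"}   # opt-in quick lane
    return None                      # unknown flags, extra text, empty input
```

For example, `parse_trigger_sketch("!testme --quick")` returns `{"CCCI_QUICK": "1"}`, while `parse_trigger_sketch("!testme --quick --force")` and `parse_trigger_sketch("please !testme")` both return `None`.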
### (claimed, now PASS) Gate: WC4 + WC7 — CLAIMED detail
**WHAT.** The `--quick` opt-in fast lane (W2): reattach the data-warm canonical → upgrade in place to
the PR head → assert (generic upgrade reconverge+moved+serving + overlay + custom); PASS →