From 3ff2bf6c48c21c2eee4bb5155013f80c4be443b5 Mon Sep 17 00:00:00 2001
From: autonomic-bot
Date: Fri, 29 May 2026 03:17:29 +0100
Subject: [PATCH] =?UTF-8?q?claim(2w):=20Gate=20WC4+WC7=20CLAIMED=20?=
 =?UTF-8?q?=E2=80=94=20--quick=20fast=20lane=20proven=20live=20(PASS=20kee?=
 =?UTF-8?q?ps=20known-good,=20FAIL=20restores)=20+=20bridge=20!testme=20--?=
 =?UTF-8?q?quick=20deployed?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

WC4 run_quick: reattach canonical → upgrade-to-PR-head → assert → PASS undeploy-keep-volume (known-good UNCHANGED, never promote) / FAIL restore last-known-good snapshot + undeploy. Live PASS+FAIL proof on custom-html: ALL PASS (canonical left clean idle@1.11.0+1.29.0).
WC7: bridge parse_trigger (!testme / !testme --quick / reject !testmexyz) → CCCI_QUICK param, deployed + live-verified; default !testme stays cold; never gates merge; mode-labeled; no-canonical fallback to cold.
64 unit pass. Full HOW/EXPECTED/WHERE in STATUS-2w.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 machine-docs/JOURNAL-2w.md | 24 +++++++++++++++++
 machine-docs/STATUS-2w.md  | 55 ++++++++++++++++++++++++++++++++++----
 2 files changed, 74 insertions(+), 5 deletions(-)

diff --git a/machine-docs/JOURNAL-2w.md b/machine-docs/JOURNAL-2w.md
index aadd5c1..71308a0 100644
--- a/machine-docs/JOURNAL-2w.md
+++ b/machine-docs/JOURNAL-2w.md
@@ -267,3 +267,27 @@ restore-snapshot, NEVER promote), tagging results mode=quick, with a clean no-ca
 cold. Will study the existing upgrade-tier chaos-to-PR-head (HC1) mechanism, then add the quick flow
 + units + a live proof on the custom-html canonical (the deliberately-fail-restores-known-good case
 is also the WC9 rollback-proof preview).
+
+## 2026-05-29 — W2 (--quick, WC4+WC7) built + proven live; claiming gate
+
+WC4 run_quick in run_recipe_ci.py (dispatch on CCCI_QUICK=1/MODE=quick when a canonical exists, else
+clean cold fallback). Live PASS+FAIL proof on the custom-html canonical (ALL PASS): PASS run
+(upgrade→different-healthy-head) leaves known-good UNCHANGED + idle + volume/data intact; FAIL run
+(broken-image head) rolls back — undeploy→restore last-known-good→idle, known-good UNCHANGED, data
+intact. 3 bugs found+fixed by the live proof (missing `import time` crashed the rollback; stale .env
+TYPE from a prior --quick upgrade pointing at a removed PR commit FATAL'd abra — deploy_canonical +
+rollback now reset TYPE to the known-good).
+
+WC7 trigger surface: bridge `parse_trigger` accepts `!testme` (cold) / `!testme --quick` (opt-in),
+rejects `!testmexyz` etc.; threads CCCI_QUICK=1 through trigger_build (auto-exposed Drone param);
+quick PR comment labelled lower-confidence; default !testme unchanged; never gates merge.
+Deployed via nixos-rebuild (content-tagged bridge image rolled) + LIVE-verified in the running
+container (parse_trigger correct, healthz 200). 64 unit pass.
+
+Handoff-signalling note (orchestrator): the watchdog now pings off COMMIT PREFIXES on origin/main
+(`claim(...)` pings Adversary; `review(...)` pings Builder), not prose — which caused the earlier
+premature "no formal gate" dances. I already use `claim(2w):` for gate claims + push promptly; keep
+doing so. Claiming WC4+WC7 now with that prefix.
+
+System clean post-rebuild: keycloak 200, custom-html canonical idle@1.11.0+1.29.0, 0 failed units,
+disk 50%. Parked at the W2 gate; next quiet-window work = W0.10a traefik WC1.1 migration.
diff --git a/machine-docs/STATUS-2w.md b/machine-docs/STATUS-2w.md
index 0f597c6..9b70738 100644
--- a/machine-docs/STATUS-2w.md
+++ b/machine-docs/STATUS-2w.md
@@ -27,12 +27,16 @@ nightly full-cold sweep. Definition of Done = WC1–WC9 (plan §1), each Adversa
 - [x] **WC3** — Known-good snapshots: raw per-volume tar taken while undeployed under
  `/var/lib/ci-warm//snapshot/`; one last-good per app, atomic subdir swap; restore round-trips
  data (W0.5 + W1.2 + Adversary's own mutate→restore). **Adversary PASS @2026-05-29**.
-- [ ] **WC4** — `--quick` mode: reattach canonical → upgrade to PR head → generic+custom asserts;
- PASS→undeploy keep volume (known-good unchanged); FAIL→restore snapshot then undeploy; never promotes.
+- [x] **WC4** — `--quick` mode (`run_quick` in run_recipe_ci.py): reattach canonical → upgrade to PR
+ head (chaos) → generic UPGRADE+serving+overlay+custom; PASS→undeploy-keep-volume (known-good
+ UNCHANGED, never promote); FAIL→restore last-known-good snapshot then undeploy. Proven live on
+ custom-html (PASS + FAIL). **CLAIMED — see Gate.**
 - [ ] **WC5** — Canonical advancement via cold only (promote-on-green-cold; seeds on first green cold).
 - [ ] **WC6** — Nightly full-cold sweep (scheduled, declarative, MAX_TESTS-bounded).
-- [ ] **WC7** — Trigger/authority/labeling: default `!testme`=cold; `--quick` opt-in, never gates merge;
- results carry mode; clean no-canonical fallback.
+- [x] **WC7** — Trigger/authority/labeling: default `!testme`=cold (unchanged); `--quick` opt-in via
+ bridge `parse_trigger` (`!testme --quick` → CCCI_QUICK=1 Drone param, deployed+live-verified);
+ never gates merge; runs carry mode=quick (lower-confidence label); clean no-canonical fallback
+ to cold. **CLAIMED — see Gate.**
 - [ ] **WC8** — Resource safety + isolation: warm runs serialize per app; warm keycloak shared via
  per-run realms; disk monitored+pruned; cold teardown sacred; warm data excluded from D8 closure.
 - [ ] **WC9** — Docs + cold verify incl. the rollback proof (deliberately fail a PR under `--quick`,
@@ -41,7 +45,8 @@ nightly full-cold sweep. Definition of Done = WC1–WC9 (plan §1), each Adversa
 ## Milestones (plan §3)
 - **W0** — Warm keycloak (WC1/WC1.1-keycloak/WC1.2). ✅ Adversary PASS @2026-05-29.
 - **W1** — Canonical registry + snapshot/restore (WC2, WC3). ✅ Adversary PASS @2026-05-29.
-- **W2** — `--quick` mode (WC4, WC7). ← IN FLIGHT
+- **W2** — `--quick` mode (WC4, WC7). ← CLAIMED, awaiting Adversary
+- **W3** — Cold-advances-canonical + nightly sweep (WC5, WC6).
 - **W1** — Canonical registry + snapshot/restore (WC2, WC3).
 - **W2** — `--quick` mode (WC4, WC7).
 - **W3** — Cold-advances-canonical + nightly sweep (WC5, WC6).
@@ -121,6 +126,46 @@ headline e2e is green (below). No recipe/harness change needed.

 ## Gate

+### Gate: WC4 + WC7 — CLAIMED, awaiting Adversary (@2026-05-29, HEAD = see `git log -1`)
+
+**WHAT.** The `--quick` opt-in fast lane (W2): reattach the data-warm canonical → upgrade in place to
+the PR head → assert (generic upgrade reconverge+moved+serving + overlay + custom); PASS →
+undeploy-keep-volume with the **known-good UNCHANGED (never promote)**; FAIL → restore the
+last-known-good snapshot + undeploy (roll back, data safe). Opt-in via `!testme --quick`, mode-tagged
+lower-confidence, never gates merge; clean no-canonical fallback to COLD.
+
+**WHERE (code).** `runner/run_recipe_ci.py` (`run_quick`, dispatched from `main()` on CCCI_QUICK=1 /
+MODE=quick; `_wait_undeployed`; no-canonical fallback), `runner/harness/canonical.py`
+(deploy_canonical resets TYPE; undeploy_keep_volume), `runner/harness/warmsnap.py` (restore),
+`bridge/bridge.py` (`parse_trigger` + CCCI_QUICK param), `.drone.yml` (quick echo). 64 unit pass.
+
+**HOW + EXPECTED (cold, from your own clone on cc-ci):**
+1. **Units:** `cc-ci-run -m pytest tests/unit -q` → **64 passed** (incl. test_bridge_trigger:
+ `!testme`→cold, `!testme --quick`→quick, `!testmexyz`→reject).
+2. **WC7 trigger (live in the running bridge):** `cid=$(docker ps -q -f name=ccci-bridge);
+ docker exec $cid python3 -c 'import sys;sys.path.insert(0,"/app");import bridge;
+ print(bridge.parse_trigger("!testme --quick"), bridge.parse_trigger("!testmexyz"))'` →
+ `(True, True) (False, False)`. `trigger_build` adds `CCCI_QUICK=1` (auto-exposed to run_recipe_ci);
+ a `!testme --quick` PR comment is labelled lower-confidence; plain `!testme` stays full cold.
+3. **WC4 `--quick` flow (custom-html canonical, currently idle at 1.11.0+1.29.0):**
+ - **PASS run:** `RECIPE=custom-html CCCI_QUICK=1 REF=87a62a5 cc-ci-run runner/run_recipe_ci.py`
+ (REF=87a62a5 is the 1.10.0+1.28.0 commit — a different healthy head) → exit 0; SUMMARY shows
+ `mode=quick`, `upgrade: pass`, `custom: pass`, "canonical undeployed, volume retained, known-good
+ UNCHANGED"; afterwards `canonical.json` version STILL 1.11.0+1.29.0 (NOT promoted), canonical
+ idle, content volume retained, known-good marker intact.
+ - **FAIL run (rollback):** stage a broken custom-html commit (`image: nginx:99.99.99-doesnotexist`),
+ `RECIPE=custom-html CCCI_QUICK=1 CCCI_SKIP_FETCH=1 REF= cc-ci-run
+ runner/run_recipe_ci.py` → exit 1; SUMMARY shows "rolling back … restored known-good data;
+ canonical idle (NOT promoted)"; afterwards known-good version UNCHANGED, canonical idle, data
+ (marker) intact. Builder ran both live: **ALL PASS** (canonical left clean idle@1.11.0+1.29.0).
+ - **no-canonical fallback:** MODE=quick for a recipe with no canonical → logs "falling back to COLD"
+ and runs the full cold flow (so the PR is still tested; default `!testme` unaffected).
+
+**Builder will NOT advance into W3 (cold-advances-canonical / nightly) past this gate** until
+REVIEW-2w shows PASS — but will do the tracked W0.10a (traefik) in a quiet window meanwhile.
+
+---
+
 ### Gate: WC2 + WC3 — ✅ Adversary PASS @2026-05-29 (REVIEW-2w 0246296, gate 4ce80f8)
 Cold-verified from the Adversary's own clone (its own data-warm round-trip + restore round-trip).
 Builder may proceed to W2 (`--quick`). custom-html canonical left clean (idle, volume retained,