From 0a32854853eca9a496a5eb67647c32fff3e5607c Mon Sep 17 00:00:00 2001
From: autonomic-bot
Date: Sat, 13 Jun 2026 13:45:25 +0000
Subject: [PATCH] =?UTF-8?q?review(pxgate-M2):=20PASS=20=E2=80=94=20cold-bo?=
 =?UTF-8?q?ot=20sim=20confirms=20cycle=20broken,=20proxy=20active=20withou?=
 =?UTF-8?q?t=20dashboard?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

nixos-rebuild deployed fix; new nix store path 8qjh8apxcbs85 with /api/version probe; deploy-proxy active(exited) at 13:43:15 UTC; cold-boot sim: proxy started active(exited) with dashboard stopped; all 9 services 1/1; alert dir empty; rollback gate unchanged. Phase pxgate DoD fully met. Builder may write ## DONE.
---
 machine-docs/REVIEW-pxgate.md | 69 +++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/machine-docs/REVIEW-pxgate.md b/machine-docs/REVIEW-pxgate.md
index 7f75908..1027d94 100644
--- a/machine-docs/REVIEW-pxgate.md
+++ b/machine-docs/REVIEW-pxgate.md
@@ -219,3 +219,72 @@ Confirmed: old probe still live in active nix store path (km6173hm5a77wxggd7zba3
 **STATUS-pxgate.md M2 pre-check:** builder-clone on cc-ci must be pulled to ≥ `0e9fd38` before nixos-rebuild. Current: `caef217` (stale). Orchestrator must `cd /root/builder-clone && git pull` first.
 
 No new findings warranting a VETO. All running-system probes PASS.
+
+---
+
+## M2 — Proven on a real nixos-rebuild
+
+### PASS @2026-06-13T13:44Z — Adversary cold-verified
+
+nixos-rebuild completed (detected by Adversary at ~13:43:15 UTC — new nix store path appeared on deploy-proxy). Full M2 acceptance run executed independently.
+
+#### Check 1 — deploy-proxy active (exited) after nixos-rebuild ✅
+
+```
+Active: active (exited) since Sat 2026-06-13 13:43:15 UTC
+Invocation: fe8a806fbb5b40239c31a5c48f381cd1
+Process: 3171211 ExecStart=/nix/store/8qjh8apxcbs85asgizkymjskicf4zmsl-cc-ci-reconcile-proxy/bin/cc-ci-reconcile-proxy (code=exited, status=0/SUCCESS)
+```
+
+No alert written.
New nix store path `8qjh8apxcbs85asgizkymjskicf4zmsl` — different from old `km6173hm5a77wxggd7zba3mfakrz0c94`.
+
+#### Check 2 — `/api/version` probe in new nix store path ✅
+
+New runner: `/nix/store/5hic3aba65i88m1ib67b7g6dwzrzd1z2-runner/warm_reconcile.py`
+
+Traefik spec confirmed:
+```python
+"traefik": {
+    "recipe": "traefik",
+    "domain": "traefik.ci.commoninternet.net",
+    "health_path": "/api/version",  # ← new probe
+    "health_ok": (200,),
+    ...
+}
+```
+`health_domain` key absent → probe URL = `https://traefik.ci.commoninternet.net/api/version` (no backend/dashboard dep). Source grep confirms the inline comment: "traefik's OWN /api/version endpoint (no backend/dashboard dependency)".
+
+#### Check 3 — All services 1/1 (running server unaffected) ✅
+
+All 9 Docker services 1/1 after nixos-rebuild:
+`backups`, `ccci-bridge`, `ccci-dashboard`, `ccci-reports`, `drone`, `traefik_app`, `traefik_socket-proxy`, `warm-keycloak_app`, `warm-keycloak_db`.
+
+Dashboard (`https://ci.commoninternet.net/`) → 200. `/api/version` → 200.
+
+#### Check 4 — Cold-boot simulation: proxy starts without dashboard ✅
+
+Adversary executed the definitive cold-boot simulation (STATUS-pxgate.md Check 5):
+
+```
+1. systemctl stop deploy-dashboard → inactive ✓
+2. systemctl stop deploy-proxy && systemctl reset-failed deploy-proxy
+3. systemctl start deploy-proxy
+   → Active: active (exited) since Sat 2026-06-13 13:44:01 UTC ✓
+   → Process: ExecStart=.../8qjh8apxcbs85asgizkymjskicf4zmsl-cc-ci-reconcile-proxy ... (status=0/SUCCESS)
+4. systemctl start deploy-dashboard → active (exited) ✓
+5. All services 1/1; dashboard → 200; /api/version → 200 ✓
+```
+
+**Deploy-proxy reached `active (exited)` with the dashboard not running — cycle conclusively broken.** The old probe (ci.commoninternet.net/) would have timed out at 300s (health_timeout) trying to reach a dashboard that wasn't started yet.
+
+#### Check 5 — Alert directory empty ✅
+
+`/var/lib/ci-warm/alerts/` empty after both the nixos-rebuild run and the cold-boot simulation. No unhealthy alert written — new probe returned 200 on first health check.
+
+#### Check 6 — Rollback path (code-proof, unchanged) ✅
+
+`health_code()` unchanged: returns `int(r.stdout.strip() or "0")` → 0 on curl failure → 0 ∉ (200,) → `wait_healthy()` returns False → rollback triggered. Gate has teeth. (Confirmed same as M1.)
+
+---
+
+**M2 VERDICT: PASS** — nixos-rebuild deployed the fix; deploy-proxy active without deadlock; cold-boot simulation confirmed cycle broken; all services unaffected; rollback intact. Phase pxgate Definition of Done fully met. Builder may write ## DONE.
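
Reviewer's note (illustrative only, not part of the patch): the probe-URL derivation from Check 2 and the rollback chain from Check 6 can be sketched as below. This is a minimal sketch, not the code in `warm_reconcile.py`. The helper names `probe_url` and `parse_status`, the injected `probe` callable, and the `interval` parameter are hypothetical. The fallback from `health_domain` to `domain` is inferred from Check 2, and only the expression `int(stdout.strip() or "0")` is taken from the review text.

```python
import time
from typing import Callable, Mapping, Tuple


def probe_url(spec: Mapping) -> str:
    """Build the health URL (hypothetical helper).

    Assumption: when `health_domain` is absent, the spec's own `domain` is used.
    """
    host = spec.get("health_domain", spec["domain"])
    return f"https://{host}{spec['health_path']}"


def parse_status(curl_stdout: str) -> int:
    """Apply the expression quoted in Check 6: empty output parses as 0."""
    return int(curl_stdout.strip() or "0")


def wait_healthy(probe: Callable[[], int], ok: Tuple[int, ...],
                 timeout: float, interval: float = 5.0) -> bool:
    """Poll `probe` until it returns a code in `ok` or `timeout` seconds pass.

    The probe is injected so the sketch runs without any network access.
    """
    deadline = time.monotonic() + timeout
    while True:
        if probe() in ok:
            return True
        if time.monotonic() >= deadline:
            return False  # the caller would trigger the rollback here
        time.sleep(interval)


# Spec fields as listed in Check 2 (the elided entries are omitted).
traefik = {
    "recipe": "traefik",
    "domain": "traefik.ci.commoninternet.net",
    "health_path": "/api/version",
    "health_ok": (200,),
}

url = probe_url(traefik)
healthy_now = wait_healthy(lambda: 200, traefik["health_ok"], timeout=1.0)
# Simulated curl failure: empty stdout parses as 0, which is not in (200,).
healthy_on_failure = wait_healthy(lambda: parse_status(""), traefik["health_ok"],
                                  timeout=0.05, interval=0.01)
```

Under these assumptions a failed probe can never pass the gate: an empty or all-zero status string parses to 0, and 0 is never a member of `health_ok`, so the loop runs out the timeout and returns False.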