# REVIEW — phase pvcheck (post-proxy verification)

Adversary-owned. Append-only verdicts.
All commands run cold from /srv/cc-ci-orch/cc-ci-adv (own clone).

---

## Adversary baseline probe — 2026-06-13T05:56Z

**Context:** Phase pvfix is DONE (STATUS-pvfix.md ## DONE). pvcheck preconditions verified cold.

### Precondition checks

| Check | Result |
|---|---|
| pvfix DONE | ✅ STATUS-pvfix.md shows `## DONE`, both M1+M2 PASS |
| `proxy` subnet | ✅ `10.10.0.0/16` (`docker network inspect proxy --format "{{range .IPAM.Config}}{{.Subnet}}{{end}}"`) |
| `proxy` IPAM driver | ✅ default, gateway 10.10.0.1 |
| All services 1/1 | ✅ 9 services all `1/1` (backups, bridge, dashboard, reports, drone, traefik×2, keycloak×2) |
| `ci.commoninternet.net` | ✅ HTTP/2 200 |
| `drone.ci.commoninternet.net` | ✅ HTTP/2 303 |
| `report.ci.commoninternet.net` | ✅ HTTP/2 200 |
| VIP exhaustion after 05:38Z | ✅ NONE: `journalctl -u docker --since "2026-06-13 05:38:00" \| grep "available IP while allocating VIP"` → empty |
| Transient errors at 05:35Z | ℹ️ "could not find network allocator STATE" for OLD net IDs (mlxau8…, 85p3aq…); these are expected during proxy recreation (swarm allocator losing state for the deleted /24 network) |
| No new VIP exhaustion | ✅ post-fix journal clean |

**Command evidence:**

```
$ docker network inspect proxy --format "{{json .IPAM}}"
{"Driver":"default","Options":null,"Config":[{"Subnet":"10.10.0.0/16","Gateway":"10.10.0.1"}]}
$ docker service ls --format "{{.Name}}\t{{.Replicas}}"
backups_ci_commoninternet_net_app 1/1
ccci-bridge_app 1/1
ccci-dashboard_app 1/1
ccci-reports_app 1/1
drone_ci_commoninternet_net_app 1/1
traefik_ci_commoninternet_net_app 1/1
traefik_ci_commoninternet_net_socket-proxy 1/1
warm-keycloak_ci_commoninternet_net_app 1/1
warm-keycloak_ci_commoninternet_net_db 1/1
```

### Upgrade-all Step-0 guard — independent check

**Guard location:** `/srv/cc-ci-orch/.claude/skills/upgrade-all/SKILL.md` §0, lines 61-81

**Guard logic:** `VIPFAIL=$(ssh cc-ci 'journalctl -u docker --since "26 hours ago" | grep -c "available IP while allocating VIP"')` → if >0, `systemctl restart docker`

**Guard exists:** ✅ confirmed cold-read

**Guard would fire:** ✅ triggers on the EXACT original error signature (`"available IP while allocating VIP"`); would detect and recover if VIP exhaustion recurs despite the /16 fix (belt+suspenders)

**STALE TEXT NOTE:** Skill still says "(The durable fix ... is tracked in plan-proxy-vip-exhaustion-fix.md; this guard is the per-run safety net until that lands.)", but the durable fix HAS now landed. This is a documentation smell, not a functional defect; the guard logic is correct and still useful. Filing as advisory finding [A2].

---

## Adversary independent allocator-headroom probe — 2026-06-13T06:02Z

**Method:** deploy 5 throwaway nginx stacks concurrently joining `proxy`, then remove all 5 concurrently (same concurrent-rm pattern that caused endpoint GC races under the old /24).

| Check | Result |
|---|---|
| BASELINE proxy containers | 9 |
| AFTER DEPLOY (5 stacks added) | 14 |
| AFTER concurrent stack rm | 9 (back to baseline) |
| Leaked endpoints | **0** |
| VIP exhaustion errors during test | **0** |
| Swarm GC race errors (key modified / network proxy remove failed) | **0** |
| Network prune output | empty (nothing to reclaim) |
| AFTER prune residue | **0** |
| All pvcheck-throwaway stacks removed | ✅ confirmed |

**Verdict:** The /16 subnet has sufficient headroom that 5 concurrent deploy/rm cycles produce zero endpoint leaks and zero VIP errors. No residue after prune.

**Note:** 5 stacks is a conservative test: the original exhaustion required ~45 GC races over 11 days uptime. The /16 has 65534 VIPs vs the old /24's 254, so the leak rate would need to be ~258× faster to hit the same ceiling. This probe confirms the allocator is healthy and the /16 provides the claimed headroom.
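
The headroom figures in the note above can be reproduced with a short shell sketch. It assumes the usable pool of an IPv4 subnet is 2^(32 - prefix) minus the network and broadcast addresses, which matches the 254 and 65534 figures quoted. `usable_addrs` is a hypothetical helper name, and the gateway or any other reserved addresses are deliberately not subtracted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: usable IPv4 addresses for a prefix length,
# excluding only the network and broadcast addresses (an assumption).
usable_addrs() {
  local prefix=$1
  echo $(( (1 << (32 - prefix)) - 2 ))
}

old=$(usable_addrs 24)   # old proxy network (/24)
new=$(usable_addrs 16)   # new proxy network (/16)

# Integer division is enough to show the order of magnitude.
echo "old=${old} new=${new} ratio=$(( new / old ))x"
```

Run as written, this prints `old=254 new=65534 ratio=258x`, consistent with the ~258× figure in the note.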

---

## M1 — PASS @2026-06-13T06:10Z

**Cold verify run — Adversary's own commands, no cached state.**

| Check | Command | Result |
|---|---|---|
| proxy subnet | `docker network inspect proxy --format "Subnet: {{range .IPAM.Config}}{{.Subnet}}{{end}}, Endpoints: {{len .Containers}}"` | **`10.10.0.0/16`, Endpoints: 7** ✅ |
| 9 services 1/1 | `docker service ls --format "{{.Name}}\t{{.Replicas}}"` | all 1/1 ✅ |
| ci.commoninternet.net | `curl -sk -o /dev/null -w "%{http_code}"` | **200** ✅ |
| drone.ci.commoninternet.net | same | **303** ✅ |
| report.ci.commoninternet.net | same | **200** ✅ |
| VIP exhaustion since 05:38Z | `journalctl -u docker --since "2026-06-13 05:38:00" \| grep -c "available IP while allocating VIP"` | **0** ✅ |
| swarm.nix /16 declared | `grep "10.10" nix/modules/swarm.nix` | `--subnet 10.10.0.0/16` ✅ |
| swarm.nix commit | `git show e6349a9 --stat` | confirmed ✅ |
| Step-0 guard text | `grep -A8 "VIPFAIL" upgrade-all/SKILL.md` | guard exists, checks exact signature ✅ |
| [A2] fix | `git -C /srv/cc-ci-orch log --oneline \| grep 84e13a7` | `fix(pvcheck/A2): update upgrade-all SKILL.md guard description` ✅ |
| [A2] text updated | SKILL.md line ~81 | "belt-and-suspenders even after the /16 fix" ✅ |

**All M1 criteria verified independently from cold start.** Builder's before/after evidence is consistent with what Adversary observed directly. No discrepancies.

[A2] CLOSED: fix confirmed in orchestrator commit 84e13a7.

## M2 — PENDING (awaiting Builder claim)

Real recipe CI run AFTER the proxy fix (05:38Z) still needed. Dashboard shows run #585 (ghost, ~04:56Z) was before the fix, so a new !testme run post-fix is required for M2.

Adversary independent allocator-headroom probe already completed (2026-06-13T06:02Z, see above): 5 concurrent stacks, 0 leaks, 0 VIP errors. Awaiting Builder's full headroom proof + real recipe run claim.
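
The "9 services 1/1" check from M1 can be made mechanical for future runs, including the M2 re-verification. The sketch below is an illustration, not part of the recorded evidence: `unhealthy_services` is a hypothetical helper that reads `name<TAB>running/desired` lines (the shape produced by `docker service ls --format "{{.Name}}\t{{.Replicas}}"`) and prints any service whose running count differs from its desired count. The sample input is made up, and `example_degraded_app` is a placeholder name, not a real service.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "name<whitespace>running/desired" lines on stdin
# and print the names of services that are not fully converged.
unhealthy_services() {
  awk '{ split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# Made-up sample input; the last entry is a placeholder for a degraded service.
sample=$(printf '%s\t%s\n' \
  'ccci-bridge_app' '1/1' \
  'ccci-dashboard_app' '1/1' \
  'example_degraded_app' '0/1')

printf '%s\n' "$sample" | unhealthy_services
```

Against a live swarm the same filter could be fed directly from `docker service ls --format "{{.Name}}\t{{.Replicas}}"`; empty output would then correspond to the "all 1/1" result recorded above.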

---

## Adversary findings

### [A2] upgrade-all SKILL.md stale description — guard text still says "until that lands" (2026-06-13T05:56Z)

**Severity:** Documentation / low

**Location:** `/srv/cc-ci-orch/.claude/skills/upgrade-all/SKILL.md` line 81

**Current text:** "this guard is the per-run safety net until that lands"

**Issue:** the durable fix (proxy /16) has landed, so this text now misleads about the guard's purpose (it IS still useful as belt+suspenders, but no longer "until the fix lands")

**Suggested fix:** update to "this guard remains as belt-and-suspenders even after the /16 subnet fix"

**NOT a VETO**: guard logic is correct; this is documentation only.

Status: open (Builder may fix; Adversary closes after re-read)
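
For context on the guard this finding describes, its decision logic can be sketched in a form that runs without touching the live host. Only the error signature and the "count > 0 means restart" rule come from the guard as quoted in this review. The helper names `count_vip_failures` and `guard_action`, the sample input, and the printed action labels are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Error signature quoted in the guard; everything else here is illustrative.
VIP_SIGNATURE="available IP while allocating VIP"

# Hypothetical helper: count stdin lines that contain the signature.
count_vip_failures() {
  grep -c -- "$VIP_SIGNATURE" || true   # grep -c exits 1 when the count is 0
}

# Hypothetical helper: print the action the guard would take for a count.
guard_action() {
  local count=$1
  if [ "$count" -gt 0 ]; then
    echo "restart-docker"
  else
    echo "no-op"
  fi
}

# Dry run against placeholder text rather than the real journal.
sample_log=$(printf '%s\n' \
  'placeholder line: available IP while allocating VIP' \
  'placeholder line: unrelated message')

vipfail=$(printf '%s\n' "$sample_log" | count_vip_failures)
guard_action "$vipfail"
```

With the placeholder input this prints `restart-docker`. In the real guard the input would come from `journalctl -u docker --since "26 hours ago"` on cc-ci, and the restart action would be `systemctl restart docker`, as recorded in the Step-0 guard check.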