review(pvfix-M1): M1 PASS — patch + procedure verified cold
Some checks failed
continuous-integration/drone/push Build is failing
Patch: swarm.nix line 47 adds --subnet 10.10.0.0/16 correctly. Safety: live host full subnet table confirms 10.10.0.0/16 clear. Procedure: service names verified against host, sequencing sound, backups stack correctly excluded, nixos-rebuild will restart swarm-init. Non-blocking note: explicit systemctl restart swarm-init recommended as belt-and-braces after nixos-rebuild.
@@ -16,6 +16,80 @@ Cold check of live host and current repo:

The fix is needed. Watching for Builder M1 claim (patch + procedure + live inspection proof).

### Break-it probe: live host subnet collision check (2026-06-13T05:31Z)

Existing subnets on host:
- `ingress`: `10.0.0.0/24`
- `proxy` (current): `10.0.1.0/24`
- `docker0`: `172.17.0.0/16`
- `docker_gwbridge`: `172.18.0.0/16`
- Host IP: `91.98.47.73` (public), `100.95.31.88` (tailscale), gateway `172.31.1.1`

**10.10.0.0/16 (proposed):** does NOT collide with any existing subnet. Safe.

Services currently on proxy (will be disrupted during recreation):
- `traefik` → 10.0.1.9
- `ccci-reports` → 10.0.1.7
- `drone` → 10.0.1.12
- `ccci-bridge` → 10.0.1.248
- `ccci-dashboard` → 10.0.1.249
- `warm-keycloak` → 10.0.1.251

Stacks currently running (all will briefly lose routing):
`backups`, `ccci-bridge`, `ccci-dashboard`, `ccci-reports`, `drone`, `traefik`, `warm-keycloak`

**Maintenance window status:** CLEAR. No active recipe test stacks (`*-pr*`), no cfold sweep,
no /upgrade-all visible. A quiet window is available now.

**Key risk to probe when M2 is claimed:** confirm that after proxy recreation, all 6 services
above rejoin with healthy VIP allocations and Traefik routes are reachable end-to-end.
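
One way to frame the address half of that M2 probe is a membership check: any service still reporting an address outside `10.10.0.0/16` has not rejoined the recreated network. The sketch below is illustrative only. The helper name `not_rejoined` is an assumption, the input is the pre-recreation address list recorded above, and collecting fresh addresses from the host (for example from `docker network inspect proxy`) is left out. It says nothing about Traefik route reachability, which still needs an end-to-end request.

```python
import ipaddress

# Subnet the recreated proxy network is expected to use (from the M1 patch).
NEW_PROXY = ipaddress.ip_network("10.10.0.0/16")

# Addresses recorded before recreation, copied from the list above.
before = {
    "traefik": "10.0.1.9",
    "ccci-reports": "10.0.1.7",
    "drone": "10.0.1.12",
    "ccci-bridge": "10.0.1.248",
    "ccci-dashboard": "10.0.1.249",
    "warm-keycloak": "10.0.1.251",
}


def not_rejoined(addresses, subnet=NEW_PROXY):
    """Return the names of services whose address lies outside `subnet`."""
    return sorted(
        name
        for name, addr in addresses.items()
        if ipaddress.ip_address(addr) not in subnet
    )


# Before recreation every service is still on 10.0.1.0/24, so all six are listed.
print(not_rejoined(before))
```

After recreation, feeding the freshly inspected addresses through the same helper should yield an empty list if all six services picked up allocations in the new /16.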

---

<!-- verdicts appended below as Builder gates are claimed -->
## M1: PASS @2026-06-13T05:33Z

**Claim:** `nix/modules/swarm.nix` patched with `--subnet 10.10.0.0/16`; maintenance procedure
documented; chosen /16 proven safe from live host inspection.

**Commit:** `e6349a9` (`claim(pvfix-M1): proxy /16 patch + maintenance plan ready`)

### Cold-run evidence

**1. Patch in repo:**
```
grep -n 'subnet' nix/modules/swarm.nix
→ 47: docker network create --driver overlay --attachable --subnet 10.10.0.0/16 proxy
```
Correct. The `if ! docker network inspect proxy` guard ensures idempotent create. Comment
accurately names the failure mode and runbook. ✓

**2. Subnet safety — live host inspection:**
```
docker network inspect $(docker network ls -q) --format "{{.Name}}: {{range .IPAM.Config}}{{.Subnet}}{{end}}"
→
backups_ci_commoninternet_net_default: 10.0.4.0/24
bridge: 172.17.0.0/16
docker_gwbridge: 172.18.0.0/16
host: (none)
ingress: 10.0.0.0/24
none: (none)
proxy: 10.0.1.0/24
traefik_ci_commoninternet_net_internal: 10.0.2.0/24
warm-keycloak_ci_commoninternet_net_internal: 10.0.3.0/24
```
Builder's table matches exactly. `10.10.0.0/16` is clear of all existing networks. ✓
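
The "clear of all existing networks" conclusion can be re-derived mechanically from the inspection output. The following is a minimal sketch using Python's standard `ipaddress` module. The `collisions` helper is an illustrative name, and the table is transcribed by hand from the output above (the `host` and `none` entries carry no subnet and are omitted).

```python
import ipaddress

# Subnets transcribed from the `docker network inspect` output above.
existing = {
    "backups_ci_commoninternet_net_default": "10.0.4.0/24",
    "bridge": "172.17.0.0/16",
    "docker_gwbridge": "172.18.0.0/16",
    "ingress": "10.0.0.0/24",
    "proxy": "10.0.1.0/24",
    "traefik_ci_commoninternet_net_internal": "10.0.2.0/24",
    "warm-keycloak_ci_commoninternet_net_internal": "10.0.3.0/24",
}


def collisions(candidate, networks):
    """Return the names of networks whose subnet overlaps `candidate`."""
    cand = ipaddress.ip_network(candidate)
    return sorted(
        name
        for name, cidr in networks.items()
        if cand.overlaps(ipaddress.ip_network(cidr))
    )


print(collisions("10.10.0.0/16", existing))  # → []
# Counter-example: 10.0.0.0/16 would swallow every existing 10.0.x.0/24 network.
print(collisions("10.0.0.0/16", existing))
```

The counter-example shows why the check is worth running rather than eyeballing: a /16 rooted at `10.0.0.0` would overlap five of the networks listed.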

**3. Maintenance procedure review:**
- **Service names confirmed correct** against live host:
  `deploy-proxy`, `deploy-drone`, `deploy-bridge`, `deploy-dashboard`, `deploy-reports`,
  `warm-keycloak`. All exist as active oneshot services. ✓
- **backups stack correctly excluded:** `backups_ci_commoninternet_net_default` (10.0.4.0/24)
  is NOT on `proxy` (confirmed via proxy Containers inspection). ✓
- **Step sequencing is safe:** stack rm → drain wait → network rm → nixos-rebuild (triggers
  swarm-init with new --subnet) → restart deploy services. ✓
- **nixos-rebuild will restart swarm-init:** the `swarm-init.service` unit script changed (added
  --subnet flag); nixos-rebuild switch calls daemon-reload + restart for changed units. ✓
- **Note (non-blocking recommendation):** Builder may want to add an explicit
  `systemctl restart swarm-init` after nixos-rebuild as belt-and-braces insurance (in case
  daemon-reload timing is unusual). Not required for correctness, but it removes any ambiguity.
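
The reviewed sequencing can be written down as an ordered dry-run plan. This is a sketch, not the Builder's runbook: it only builds and prints command strings and executes nothing. It assumes the stack names match the six services listed on `proxy`, that `nixos-rebuild switch` needs no extra flags on this host, and that the deploy units are restarted in the order they are named above. The `maintenance_plan` helper is an illustrative name, and the explicit `swarm-init` restart is the optional belt-and-braces step from the note.

```python
# Stacks attached to proxy (assumed to share the service names listed earlier).
PROXY_STACKS = [
    "traefik", "drone", "ccci-bridge",
    "ccci-dashboard", "ccci-reports", "warm-keycloak",
]

# Oneshot units confirmed against the live host in the review.
DEPLOY_UNITS = [
    "deploy-proxy", "deploy-drone", "deploy-bridge",
    "deploy-dashboard", "deploy-reports", "warm-keycloak",
]


def maintenance_plan(stacks, units):
    """Return the reviewed step order as a list of command strings (dry run)."""
    plan = [f"docker stack rm {stack}" for stack in stacks]
    # Drain wait: the network cannot be removed while containers are attached.
    plan.append("# wait until `docker network inspect proxy` lists no containers")
    plan.append("docker network rm proxy")
    plan.append("nixos-rebuild switch")
    # Optional explicit restart recommended in the non-blocking note.
    plan.append("systemctl restart swarm-init")
    plan += [f"systemctl restart {unit}" for unit in units]
    return plan


for step in maintenance_plan(PROXY_STACKS, DEPLOY_UNITS):
    print(step)
```

Keeping the plan as data makes the ordering constraints easy to assert, for example that `docker network rm proxy` never precedes a `docker stack rm` and that `backups` never appears.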

**M1 PASS: safe to execute the maintenance procedure.** Waiting for Builder M2 claim.