claim(2w): W0.10a traefik WC1.1 migrated onto shared health-gated reconciler — no-op converge proven; destructive rollback = Adversary cold proof
warm_reconcile.py: per-spec setup hook + health_domain; SPECS[traefik] (stateful=False, version-rollback-only, _traefik_setup preserves wildcard-cert/file-provider config, health on routed dashboard host). keycloak path unchanged.

proxy.nix: deploy-proxy.service now execs warm_reconcile.py traefik.

ZERO-disruption migration (traefik already at latest 5.1.1+v3.6.15; pre-seeded TYPE+last_good → clean no-op converge; traefik 200 + keycloak-through-traefik 200 + 0 failed). 65 unit pass.

Per operator out: code+converge delivered; destructive rollback (brief TLS blip) = Adversary's required cold proof. Closes the W0.10a tracked-open.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -308,3 +308,24 @@ Plan for next: (a) W0.10a traefik health-gated reconciler migration (quiet windo
 serves all TLS); (b) W3 WC5 promote-on-green-cold (extend cold-run teardown to re-seed the canonical
 on green-latest, reusing seed_canonical); (c) W3 WC6 nightly sweep (systemd timer: rebuild-then-cold-
 sweep). traefik first (use the window) or interleave; W0.10b alert-relay is a small loop step.
+
+## 2026-05-29 — W0.10a traefik WC1.1 migrated (quiet window) — code + no-op converge; rollback = Adversary proof
+
+Used the post-W2 quiet window (Adversary idle) for the tracked traefik WC1.1 migration. Generalized
+warm_reconcile.py: per-spec `setup` hook + `health_domain`; added SPECS["traefik"] (stateful=False →
+stateless version-rollback-only, NO snapshot; setup=_traefik_setup preserving the wildcard-cert/
+file-provider config EXACTLY via the proven newline-safe abra.env_set; health on the routed dashboard
+host). keycloak's path is unchanged (no `setup` key → default). proxy.nix migrated:
+deploy-proxy.service now execs `warm_reconcile.py traefik` (runner/ packaged in the store, D8-clean).
+
+ZERO-DISRUPTION migration: traefik was already at the latest tag (5.1.1+v3.6.15, image v3.6.15, chaos
+commit 005f023 = the tag commit). I pre-seeded the .env TYPE + last_good to 5.1.1+v3.6.15 (accurate —
+traefik IS at that version), so the health-gated reconcile is a clean no-op (current==latest==healthy)
+→ NO redeploy, NO TLS blip. Verified via nixos-rebuild switch: deploy-proxy.service → "no-op",
+traefik 200 + keycloak-through-traefik 200 + 0 failed units. 65 unit pass.
+
+Per the operator's explicit out (a destructive traefik test risks ALL TLS), I delivered the code +
+safe no-op converge and left the DESTRUCTIVE rollback as the Adversary's required cold proof (staged
+broken traefik tag → reconcile → rollback to last-good, brief TLS blip + manual recovery ready). The
+rollback logic is the proven keycloak pattern, stateless variant. Claiming W0.10a so the Adversary
+runs that cold proof. After this clears, WC1.1 is fully closed (keycloak + traefik).
@@ -15,9 +15,10 @@ nightly full-cold sweep. Definition of Done = WC1–WC9 (plan §1), each Adversa
 - [x] **WC1** — Live-warm UNPINNED keycloak; per-run namespaced realms (create+delete); concurrent
 distinct realms; orphan realms reaped. **Adversary PASS @2026-05-29** (REVIEW-2w, gate 985686f).
 - [~] **WC1.1** — Health-gated deploy-with-rollback. **keycloak (stateful) — Adversary PASS
-@2026-05-29** (marquee: broken latest → snapshot→restore→prior, data intact, last_good held,
-alert). **traefik (stateless, version-rollback-only) — NOT yet migrated = W0.10**, MUST close
-before Phase-2w DONE (Adversary will require a cold proof).
+@2026-05-29** (marquee). **traefik (stateless, version-rollback-only) — reconciler MIGRATED
+(W0.10a): proxy.nix now drives `warm_reconcile.py traefik` (shared health-gated path, no
+snapshot; cert/file-provider setup preserved); no-op converge proven live (traefik 200,
+keycloak-through-traefik 200, 0 failed). CLAIMED — destructive rollback = Adversary cold proof.**
 - [x] **WC1.2** — Pre-deploy safety gate (major / manual-migration → hold + alert with notes, no
 churn, short-circuits before WC1.1). **Adversary PASS @2026-05-29**.
 - [x] **WC2** — Data-warm canonical model: per-recipe canonical at stable domain `warm-<recipe>`,
@@ -125,6 +126,38 @@ headline e2e is green (below). No recipe/harness change needed.

 ## Gate

+### Gate: W0.10a traefik WC1.1 — CLAIMED, awaiting Adversary (@2026-05-29)
+
+**WHAT.** traefik migrated onto the shared health-gated reconciler (WC1.1, stateless =
+version-rollback-only, NO snapshot): record last-good → deploy latest tag → health-gate (routed host
+ci.commoninternet.net = 200) → healthy commit / unhealthy roll back to last-good + alert. Closes the
+W0.10a tracked-open item from the W0 gate. traefik's wildcard-cert/file-provider config preserved.
+
+**WHERE.** `runner/warm_reconcile.py` (SPECS["traefik"] stateful=False + `_traefik_setup` + health_domain;
+reconcile() per-app setup hook; the stateless path skips snapshot/restore — version rollback only),
+`nix/modules/proxy.nix` (deploy-proxy.service now execs `python3 …/warm_reconcile.py traefik`).
+
+**HOW + EXPECTED (cold):**
+1. **Units:** `cc-ci-run -m pytest tests/unit -q` → **65 passed** (incl. test_warm_reconcile traefik
+spec: stateful=False, callable setup, health_domain=ci.commoninternet.net; keycloak unchanged).
+2. **No-op converge (delivered, proven live):** `systemctl is-active deploy-proxy.service` → active;
+`journalctl -u deploy-proxy.service` → `[traefik] already on latest 5.1.1+v3.6.15 and healthy —
+no-op`; traefik serving (ci.commoninternet.net=200) + keycloak-through-traefik=200 + system
+`running` (0 failed). The migration was zero-disruption (traefik was already at the latest tag; I
+pre-seeded TYPE+last_good to 5.1.1+v3.6.15 so the reconcile is a clean no-op).
+3. **Destructive rollback (the Adversary's required cold proof):** stage a fake newer traefik tag with
+a broken config → `CCCI_SKIP_FETCH=1 cc-ci-run runner/warm_reconcile.py traefik` → broken deploy
+fails health → reconciler rolls back to last-good 5.1.1+v3.6.15 (version-only, no snapshot — traefik
+is stateless) → traefik healthy again + a `*-rollback.json` alert. NOTE: a destructive traefik test
+briefly drops TLS for ALL routes during the broken-deploy window until rollback — run it knowing
+that + with manual recovery ready (`abra app deploy traefik.ci.commoninternet.net 5.1.1+v3.6.15
+-o -n -f`). The rollback logic is the SAME proven keycloak pattern, stateless variant (no snapshot).
+
+Per operator guidance, I delivered the code + the safe no-op converge this iteration and left the
+destructive rollback as the Adversary's cold proof (a live destructive traefik test risks all TLS).
+
+---
+
 ### Gate: WC4 + WC7 — ✅ Adversary PASS @2026-05-29 (REVIEW-2w 31f0e42, gate 3ff2bf6)
 Cold-verified from the Adversary's own clone: 64 units; WC7 adversarial trigger battery (all negatives
 rejected, live bridge); WC4 never-promote (snapshot byte-identical, registry unchanged); WC4
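The stateless flow the W0.10a gate describes (no-op when already on a healthy latest, otherwise deploy, health-gate, then commit or roll back with an alert) can be sketched as a small pure function. This is a hedged illustration, not the code in `runner/warm_reconcile.py`: the function signature, the returned `(outcome, last_good)` tuple, the `FakeApp` test double, and the tag `9.9.9+broken` are all assumptions introduced for the example.

```python
from typing import Callable, List, Set, Tuple


def reconcile(current: str, latest: str, last_good: str,
              deploy: Callable[[str], None],
              is_healthy: Callable[[], bool],
              alert: Callable[[str], None]) -> Tuple[str, str]:
    """Return (outcome, new_last_good) for a stateless app (no snapshot)."""
    # Already on the latest tag and serving: nothing to redeploy.
    if current == latest and is_healthy():
        return "no-op", last_good
    deploy(latest)
    if is_healthy():
        # Healthy on the new tag: it becomes the new last-good.
        return "committed", latest
    # Unhealthy: version-only rollback to the recorded last-good, then alert.
    deploy(last_good)
    alert(f"rollback: {latest} failed health, restored {last_good}")
    return "rolled-back", last_good


class FakeApp:
    """Hypothetical stand-in for a deployed app, used only in this sketch."""

    def __init__(self, version: str, broken: Set[str]) -> None:
        self.version = version
        self.broken = broken
        self.alerts: List[str] = []

    def deploy(self, version: str) -> None:
        self.version = version

    def is_healthy(self) -> bool:
        return self.version not in self.broken


GOOD = "5.1.1+v3.6.15"
FAKE_BROKEN = "9.9.9+broken"  # made-up tag standing in for a staged broken release

app = FakeApp(GOOD, broken={FAKE_BROKEN})
noop = reconcile(app.version, GOOD, GOOD,
                 app.deploy, app.is_healthy, app.alerts.append)
rolled = reconcile(app.version, FAKE_BROKEN, GOOD,
                   app.deploy, app.is_healthy, app.alerts.append)
```

Passing `deploy`, `is_healthy` and `alert` in as callables keeps the decision logic testable without touching a live proxy, which matters here because a real broken traefik deploy drops TLS for every route until the rollback lands.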