Operator decision (2026-05-28): traefik + keycloak stay UNPINNED (fetch latest + chaos deploy);
a nightly `nixos-rebuild switch` rolls them to latest, then the full-cold sweep runs. The nix
closure stays byte-identical (recipe fetched at runtime, not in the store) so D8 holds.
Health-gated rollback is built INTO the reconciler (not nix-generation rollback, since the swarm
app isn't in the generation): record last-good -> deploy latest -> health-check -> commit or
roll back + PushNotification. Stateful apps (keycloak): snapshot the data volume before upgrade
(undeploy->snapshot->deploy-latest) and restore it on rollback, reusing the WC3 snapshot helper;
traefik = version rollback only. Added WC1.1 + updated WC1/WC6/milestones/guardrails/decisions.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Operator-directed: pause Phase 2, build the warm-data + --quick system, then resume Phase 2.
- live-warm keycloak (SSO dep, realm-per-run), data-warm canonicals (undeploy keeps volume),
cold = authoritative default. --quick reattaches the canonical, upgrades to PR head, asserts,
and rolls back to the last-known-good snapshot on failure (never loses working data).
- known-good = raw volume copy taken while undeployed (consistent), one per app, advanced ONLY
by green cold runs; a nightly full-cold sweep refreshes canonicals + is a daily regression run.
- launch.sh: insert 2w at the current index (Phase 2 -> resumes after 2w DONE); seq is now
1c 1b 1d 1e 2w 2 2b 3 4.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>