Commit Graph

3 Commits

Author SHA1 Message Date
00e90bb597 plan(2w): WC1.2 — pre-deploy auto-upgrade safety gate (major/manual-migration -> alert, hold)
Operator (2026-05-29): a passing health check does NOT prove a required manual migration ran, so
auto-update needs a PRE-deploy gate in addition to the post-deploy health rollback. Reconciler
auto-applies only non-major (patch/minor) upgrades with no manual-migration release notes; a MAJOR
recipe-version bump (or release notes flagging a manual migration) is held on the current version
with a PushNotification carrying the release notes (operator upgrades manually). Leans on abra's
own major-bump caution + recipe releaseNotes/. Updated WC1.2/WC6/principles/decisions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 00:02:28 +01:00
c3a572e4b9 plan(2w): warm/infra auto-latest nightly + health-gated rollback (snapshot stateful apps)
Operator decision (2026-05-28): traefik + keycloak stay UNPINNED (fetch latest + chaos deploy);
a nightly `nixos-rebuild switch` rolls them to latest, then the full-cold sweep runs. The nix
closure stays byte-identical (recipe fetched at runtime, not in the store) so D8 holds.
Health-gated rollback is built INTO the reconciler (not nix-generation rollback, since the swarm
app isn't in the generation): record last-good -> deploy latest -> health-check -> commit or
roll back + PushNotification. Stateful apps (keycloak): snapshot the data volume before upgrade
(undeploy->snapshot->deploy-latest) and restore it on rollback, reusing the WC3 snapshot helper;
traefik = version rollback only. Added WC1.1 + updated WC1/WC6/milestones/guardrails/decisions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-28 23:59:16 +01:00
a2728eec2d plan: Phase 2w — warm canonical deployments + --quick CI mode (interjected into Phase 2)
Operator-directed: pause Phase 2, build the warm-data + --quick system, then resume Phase 2.
- live-warm keycloak (SSO dep, realm-per-run), data-warm canonicals (undeploy keeps volume),
  cold = authoritative default. --quick reattaches the canonical, upgrades to PR head, asserts,
  and rolls back to the last-known-good snapshot on failure (never loses working data).
- known-good = raw volume copy taken while undeployed (consistent), one per app, advanced ONLY
  by green cold runs; a nightly full-cold sweep refreshes canonicals + is a daily regression run.
- launch.sh: insert 2w at the current index (Phase 2 -> resumes after 2w DONE); seq is now
  1c 1b 1d 1e 2w 2 2b 3 4.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-28 23:04:33 +01:00