ideas: ALT infra-app model — traefik/keycloak/drone as normal coop-cloud abra deployments, maintainer-updated outside Nix (parked, operator-flagged)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 00:12:15 +01:00
parent 00e90bb597
commit 9f99b134cd


@ -4,6 +4,32 @@ Post-DONE or "revisit later" ideas that are intentionally **out of scope** for t
(§2 Definition of Done). Not active work — parked here so they aren't lost. The loops may pull an
item into the project `BACKLOG.md` as `[idea]` if/when it becomes relevant.
- [ ] **ALT infra-app model: deploy traefik / warm-keycloak / drone the normal Co-op Cloud way,
updated outside Nix via abra by maintainers.** *(operator-flagged 2026-05-29, alternative to the
current Phase-2w design — return to later)*
The **current** design Nix-*reconciles* the infra/warm apps: NixOS systemd oneshots run `abra` to
deploy traefik/keycloak/drone, and a nightly `nixos-rebuild` auto-updates them to latest with a
pre-deploy major/manual-migration gate (WC1.2) + post-deploy health-gated rollback (WC1.1).
**The alternative:** treat the CI server like a normal Co-op Cloud host — traefik, the warm
keycloak, and drone are **plain abra deployments managed by maintainers**, deployed once and
**upgraded by a human via `abra app upgrade`** (using abra's own release-notes / major-bump
caution), with **no Nix reconciler and no nightly auto-update** for them. Nix would provide only
the host substrate (OS, docker/swarm, the harness, secrets), not the infra-app lifecycle.
- *Pros:* simpler — no reconciler/rollback/nightly machinery; idiomatic Co-op Cloud ops (maintainers
manage these exactly like any other coop-cloud app); updates are normal human abra actions; lower
cognitive load (no "infra is special / Nix-driven" layer).
- *Cons:* the infra apps are **no longer reproducible-from-git**: recreating the VM (the D8 throwaway
rebuild) would NOT re-establish traefik/keycloak/drone, so they would need a manual redeploy after
every rebuild (D8 then covers only the host + harness, not the infra apps). Loses the automated
nightly-latest update + health-gated auto-rollback; infra updates and rollbacks come to depend on
operator discipline. Drifts from the project's "declarative, rebuildable from scratch" ethos.
- *Note:* it's essentially one-or-the-other for the **update path**: a hybrid where Nix bootstraps
them but maintainers also `abra app upgrade` them creates a dual-ownership conflict (the Nix
reconciler would fight the maintainer's version and redeploy over it on the next activation).
- *When to revisit:* if the reconciler + rollback + nightly machinery proves high-maintenance or
brittle, or if maintainers strongly prefer normal coop-cloud workflow over the Nix layer — weigh
that against how much we value full reproducibility (D8) + hands-off auto-updates. *Added:* 2026-05-29.
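
A minimal sketch of the dual-ownership conflict described in the *Note* above, assuming a reconciler that simply forces each app back to its own target version on every activation. `App`, `reconcile`, and the version strings are hypothetical illustrations, not the project's actual reconciler or real release numbers:

```python
from dataclasses import dataclass


@dataclass
class App:
    """Hypothetical stand-in for one deployed infra app (e.g. traefik)."""
    name: str
    deployed_version: str


def reconcile(app: App, target_version: str) -> bool:
    """Force the app to the reconciler's target version.

    Returns True if the deployed version had drifted and was overwritten.
    """
    if app.deployed_version == target_version:
        return False
    app.deployed_version = target_version
    return True


# The Nix side believes 1.0.0 is the version to run (assumed value).
target = "1.0.0"
traefik = App(name="traefik", deployed_version="1.0.0")

# First activation: nothing to do, the app already matches.
first_activation_changed = reconcile(traefik, target)

# A maintainer upgrades by hand (stands in for a manual `abra app upgrade`).
traefik.deployed_version = "1.1.0"

# Next activation: the reconciler sees drift and redeploys its own version,
# silently undoing the maintainer's upgrade.
second_activation_changed = reconcile(traefik, target)
```

After the second activation the app is back on the reconciler's version, which is why the update path needs a single owner: either the reconciler or the maintainers, not both.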
- [ ] **Optional `--extra-tests` flag for heavy / operational tests (opt-in heavy suite).**
Some recipe tests are "more than needed" for the default CI signal — state-management /
long-running-instance / load / helper-script operational tests that don't fit the ephemeral