Orchestrator decision: deploy canonical coop-cloud traefik via abra instead of a hand-rolled module. abra packaged in Nix (pinned). custom-html deployed over HTTPS (200) via the gateway and torn down clean. docs/install.md seeded. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DECISIONS — cc-ci Builder
Architecture decisions and dead-ends. One line of rationale each. (§0, §8)
Settled
- Wildcard TLS: operator pre-issues wildcard cert at `/var/lib/ci-certs/live/`; Traefik file provider serves it; no ACME for commoninternet.net. (Plan §4.0/§8: fixed.)
- Repo: `git.autonomic.zone/recipe-maintainers/cc-ci`, private. Bot is org admin. (Bootstrap.)
- Git credentials: helper script in repo-local git config sources `/srv/cc-ci/.testenv` at call time; no secret values stored in `.git/config` or commits.
- Proxy: real coop-cloud/traefik via abra - SETTLED (M1, orchestrator decision 2026-05-26, overrides plan §3 `modules/traefik.nix`). Instead of a hand-rolled Traefik we deploy the canonical Co-op Cloud `traefik` recipe via abra in wildcard / file-provider mode, for end-to-end fidelity (canonical `web`/`web-secure` entrypoints + proxy/swarm conventions every recipe expects; this also fixed an entrypoint-name mismatch the custom build hit). NO ACME, NO DNS token on the box: `WILDCARDS_ENABLED=1` + append `compose.wildcard.yml`; the pre-issued cert is fed as the `ssl_cert`/`ssl_key` swarm secrets (v1) via `abra app secret insert … -f` from `/var/lib/ci-certs/live/{fullchain,privkey}.pem`. The file provider serves it (`tls.certificates`). `LETS_ENCRYPT_ENV=` empty on the traefik app and on every test app → the recipe's `tls.certresolver=${LETS_ENCRYPT_ENV}` label resolves to no resolver → routers serve the wildcard via SNI from the file provider, ACME never fires. (Verified: 0 ACME log lines.)
  - Reproducibility (D8): `scripts/deploy-proxy.sh` is idempotent (ensures local abra server, fetches recipe, writes the wildcard/no-ACME env, inserts cert secrets, deploys). Documented in `docs/install.md`. The custom `modules/traefik.nix` was removed; `modules/swarm.nix` keeps swarm init + `proxy` net + firewall 80/443.
  - Renewal (manual, ~90d): operator re-issues the wildcard at the same paths, then `abra app secret rm traefik.ci.commoninternet.net ssl_cert -n` + re-insert at a new version (bump `SECRET_WILDCARD_CERT_VERSION`) and redeploy. (Documented in docs/secrets.md at M7.)
- abra teardown syntax (for harness, §4.3): `abra app undeploy <d> -n`, `abra app volume remove <d> -f -n`, `abra app secret remove <d> --all -n`. None take `--chaos`.
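The teardown commands in the last item could be wrapped for the harness roughly as follows. This is a minimal sketch under stated assumptions, not the harness itself: the function names `run_step` and `teardown_app` and the `DRY_RUN` switch are hypothetical. Only the three abra invocations and their flags are taken from the entry above.

```sh
#!/bin/sh
# Hypothetical harness helper: tear down one test app using the three abra
# commands recorded in this file. Names and DRY_RUN are illustrative only.

# Print the command when DRY_RUN=1, otherwise execute it.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run the steps in the order listed above; stop at the first failure.
teardown_app() {
  domain="$1"
  run_step abra app undeploy "$domain" -n &&
    run_step abra app volume remove "$domain" -f -n &&
    run_step abra app secret remove "$domain" --all -n
}
```

Running it with `DRY_RUN=1` first prints the three command lines for a given domain without touching the swarm, which makes the sequence easy to review before the harness relies on it.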
Open (defaults from §8, to confirm as reality lands)
- Deploy mechanism - SETTLED (M0): `nixos-rebuild switch --flake /root/cc-ci#cc-ci` run on cc-ci itself, with the repo materialised on the host at `/root/cc-ci`. Chosen over `--target-host`/deploy-rs to avoid pushing large closures over the userspace-tailscaled SOCKS proxy (slow/fragile). Atomic rollback preserved by Nix generations (`nixos-rebuild --rollback`). The switch is launched as a detached transient systemd unit (`systemd-run --unit=ccci-rebuild --collect`) so it survives a momentary ssh-over-tailscale drop during activation. For the build loop the host copy is synced from the sandbox clone via `tar | ssh` (rsync absent on host); source of truth stays the git repo. D8/install.md will document the from-scratch path (clone repo on a fresh host, then `nixos-rebuild switch --flake .#cc-ci`).
  - nixpkgs pin: flake pins the exact rev cc-ci already ran (`50ab793…`) so the first rebuild is a true no-op-then-base. Bump deliberately, never drift.
- Webhook scope: default per-repo via enroll script.
- Drone runner type: default exec (must drive host abra).
- Secret tool - SETTLED (M0): sops-nix. cc-ci decrypts at activation using its ed25519 SSH host key as the age identity (`sops.age.sshKeyPaths`), so no extra key file to manage on the box. Recipients in `/.sops.yaml`: the host age key (`age1h90ut…`, from ssh-to-age) + an off-box master recovery key (`age1cmk26t…`; private half only at `/srv/cc-ci/.sops/master-age.txt` on the build host, never in the repo) for re-keying if cc-ci is lost. Encrypt new secrets by writing plaintext into `secrets/<f>.yaml` then `sops -e -i` (run inside the repo so `.sops.yaml` is found).
- D10 recipe set: lock six early. Candidates favouring already-mirrored: custom-html (simple), cryptpad (stateful no-DB), keycloak (SSO/DB), matrix-synapse (DB+media), lasuite-docs (multi+S3), bluesky-pds (TLS-passthrough); covers all five categories. Confirm during M4–M6.5.
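The detached rebuild recorded under the deploy mechanism item can be sketched as below. This is an assumption-laden illustration: the source gives the `systemd-run` flags and the `nixos-rebuild` invocation separately, and joining them into one command line is an inference. `rebuild_cmd`, `launch_rebuild`, `FLAKE_REF` and `DRY_RUN` are hypothetical names, and the sketch assumes `nixos-rebuild` finds what it needs inside the transient unit.

```sh
#!/bin/sh
# Sketch of launching the switch as a detached transient systemd unit.
# Function and variable names are illustrative; the unit name and flags
# are the ones recorded in this file.
FLAKE_REF="${FLAKE_REF:-/root/cc-ci#cc-ci}"

# Print the full command line so it can be inspected before running.
rebuild_cmd() {
  echo "systemd-run --unit=ccci-rebuild --collect nixos-rebuild switch --flake ${FLAKE_REF}"
}

launch_rebuild() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    rebuild_cmd
  else
    # The transient unit is owned by systemd, not by the ssh session,
    # so a dropped connection does not interrupt activation.
    systemd-run --unit=ccci-rebuild --collect \
      nixos-rebuild switch --flake "${FLAKE_REF}"
  fi
}
```

After launching, progress would normally be followed with `journalctl -u ccci-rebuild -f`; if the switch misbehaves, `nixos-rebuild --rollback` returns to the previous generation as noted above.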
Risks
- Disk — RESOLVED 2026-05-26. Original 8.9 GiB root had only ~3.8 GiB free and a hard
inode ceiling (586k total, ~6k free) — the flake's nixpkgs fetch (~50k files) hit ENOSPC on
inodes before bytes. Operator grew the VM to 28 GiB (22 GiB free, 1.78M inodes / 1.21M free);
the ext4 fs auto-resized (new block groups carry proportional inodes). Keep aggressive teardown +
periodic `docker image prune` to avoid regressing during M6.5 breadth.
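Because the original failure was inode exhaustion rather than byte exhaustion (about 6k of 586k inodes free while several GiB remained), a guard against regressing would need to watch both. The following is a minimal sketch, not something the repo is known to contain: `pct_used`, `check_headroom` and the 80% default threshold are hypothetical, and it assumes a `df` whose `-P` and `-Pi` outputs put the usage percentage in the fifth column (as GNU df does on ext4).

```sh
#!/bin/sh
# Sketch of a byte + inode headroom check for the root filesystem.
# Names and the default threshold are illustrative assumptions.

# Read `df -P` or `df -Pi` output on stdin and print the percentage
# column of the last data row as a bare integer.
pct_used() {
  awk 'NR > 1 { gsub(/%/, "", $5); v = $5 } END { print v }'
}

# Succeed only if both byte and inode usage are below the limit.
check_headroom() {
  limit="${1:-80}"
  bytes="$(df -P / | pct_used)"
  inodes="$(df -Pi / | pct_used)"
  echo "bytes=${bytes}% inodes=${inodes}%"
  [ "$bytes" -lt "$limit" ] && [ "$inodes" -lt "$limit" ]
}
```

A harness could call `check_headroom 80 || docker image prune -f` before each deploy; whether pruning alone is sufficient is a judgement the M6.5 breadth runs would have to confirm.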
Dead-ends
- (none yet)