# DECISIONS — cc-ci Builder

Architecture decisions and dead-ends. One line of rationale each. (§0, §8)

## Settled

- **Wildcard TLS:** operator pre-issues wildcard cert at `/var/lib/ci-certs/live/`; Traefik file provider serves it; **no ACME** for commoninternet.net. (Plan §4.0/§8 — fixed.)
- **Repo:** `git.autonomic.zone/recipe-maintainers/cc-ci`, private. Bot is org admin. (Bootstrap.)
- **Git credentials:** helper script in repo-local git config sources `/srv/cc-ci/.testenv` at call time — no secret values stored in `.git/config` or commits.
- **Proxy: real coop-cloud/traefik via abra — SETTLED (M1, orchestrator decision 2026-05-26, overrides plan §3 `modules/traefik.nix`).** Instead of a hand-rolled Traefik we deploy the canonical Co-op Cloud `traefik` recipe via abra in **wildcard / file-provider mode**, for end-to-end fidelity (canonical `web`/`web-secure` entrypoints + proxy/swarm conventions every recipe expects — this also fixed an entrypoint-name mismatch the custom build hit). NO ACME, NO DNS token on the box:
  - `WILDCARDS_ENABLED=1` + append `compose.wildcard.yml`; the pre-issued cert is fed as the `ssl_cert`/`ssl_key` swarm secrets (v1) via `abra app secret insert … -f` from `/var/lib/ci-certs/live/{fullchain,privkey}.pem`. The file provider serves it (`tls.certificates`).
  - `LETS_ENCRYPT_ENV=` **empty** on the traefik app *and* on every test app → the recipe's `tls.certresolver=${LETS_ENCRYPT_ENV}` label resolves to no resolver → routers serve the wildcard via SNI from the file provider, ACME never fires. (Verified: 0 ACME log lines.)
  - Reproducibility (D8): `scripts/deploy-proxy.sh` is idempotent (ensures local abra server, fetches recipe, writes the wildcard/no-ACME env, inserts cert secrets, deploys). Documented in `docs/install.md`. The custom `modules/traefik.nix` was removed; `modules/swarm.nix` keeps swarm init + `proxy` net + firewall 80/443.
  - **Renewal (manual, ~90d):** operator re-issues the wildcard at the same paths, then `abra app secret rm traefik.ci.commoninternet.net ssl_cert -n` + re-insert at a new version (bump `SECRET_WILDCARD_CERT_VERSION`) and redeploy. (Documented in `docs/secrets.md` at M7.)
- **abra teardown syntax** (for harness, §4.3): `abra app undeploy -n`, `abra app volume remove -f -n`, `abra app secret remove --all -n`. None take `--chaos`.

## Open (defaults from §8, to confirm as reality lands)

- **Deploy mechanism — SETTLED (M0):** `nixos-rebuild switch --flake /root/cc-ci#cc-ci` run *on cc-ci itself*, with the repo materialised on the host at `/root/cc-ci`. Chosen over `--target-host`/deploy-rs to avoid pushing large closures over the userspace-tailscaled SOCKS proxy (slow/fragile). Atomic rollback preserved by Nix generations (`nixos-rebuild --rollback`). The switch is launched as a **detached transient systemd unit** (`systemd-run --unit=ccci-rebuild --collect`) so it survives a momentary ssh-over-tailscale drop during activation. For the build loop the host copy is synced from the sandbox clone via `tar | ssh` (rsync absent on host); source of truth stays the git repo. D8/install.md will document the from-scratch path (clone repo on a fresh host, then `nixos-rebuild switch --flake .#cc-ci`).
- **nixpkgs pin:** flake pins the exact rev cc-ci already ran (`50ab793…`) so the first rebuild is a true no-op-then-base. Bump deliberately, never drift.
- **Webhook scope:** default per-repo via enroll script.
- **Drone runner type:** default exec (must drive host abra).
- **Secret tool — SETTLED (M0):** sops-nix. cc-ci decrypts at activation using its **ed25519 SSH host key** as the age identity (`sops.age.sshKeyPaths`), so no extra key file to manage on the box. Recipients in `/.sops.yaml`: the host age key (`age1h90ut…`, from ssh-to-age) + an off-box **master recovery key** (`age1cmk26t…`; private half only at `/srv/cc-ci/.sops/master-age.txt` on the build host, never in the repo) for re-keying if cc-ci is lost. Encrypt new secrets by writing plaintext into `secrets/.yaml` then `sops -e -i` (run inside the repo so `.sops.yaml` is found).
- **D10 recipe set:** lock six early. Candidates favouring already-mirrored: custom-html (simple), cryptpad (stateful no-DB), keycloak (SSO/DB), matrix-synapse (DB+media), lasuite-docs (multi+S3), bluesky-pds (TLS-passthrough) — covers all five categories. Confirm during M4–M6.5.

## Risks

- **Disk — RESOLVED 2026-05-26.** Original 8.9 GiB root had only ~3.8 GiB free *and* a hard **inode** ceiling (586k total, ~6k free) — the flake's nixpkgs fetch (~50k files) hit ENOSPC on inodes before bytes. Operator grew the VM to **28 GiB** (22 GiB free, 1.78M inodes / 1.21M free); the ext4 fs auto-resized (new block groups carry proportional inodes). Keep aggressive teardown + periodic `docker image prune` to avoid regressing during M6.5 breadth.

## Dead-ends

- (none yet)
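
Sketch for the abra teardown syntax recorded under Settled: a minimal dry-run wrapper showing the order a harness might call the three commands in. The function names `run` and `teardown_app`, the `RUN` switch, and the placement of the app domain directly after the subcommand are assumptions for illustration, not the harness's actual code; the domain in the comment is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: print each command unless RUN=1, in which case execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Hypothetical helper: tear down one test app, stopping at the first failure.
# Order follows the decision entry: undeploy, then volumes, then secrets.
# Example app domain (made up): demo.ci.commoninternet.net
teardown_app() {
    app="$1"
    run abra app undeploy "$app" -n &&
    run abra app volume remove "$app" -f -n &&
    run abra app secret remove "$app" --all -n
}
```

With `RUN` unset, calling `teardown_app` with an app domain only prints the three commands, which makes it easy to review the sequence before setting `RUN=1` on the CI host.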