Installing cc-ci from scratch
WORK IN PROGRESS — grows with each milestone; the full from-scratch rebuild is verified at M9 (D8).
cc-ci is declared entirely as a NixOS flake (this repo). Bringing up the box is just
clone + nixos-rebuild switch + the operator preconditions, with no manual post-steps apart from
the one-time Drone ↔ Gitea OAuth grant in step 2. The proxy (traefik) and Drone server are
deployed by idempotent-reconcile systemd oneshots (modules/proxy.nix, modules/drone.nix) that
converge the swarm to the desired state on every activation and boot (and self-heal drift),
mirroring swarm-init. Target: a NixOS 24.11 host reachable as cc-ci over SSH (root).
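The reconcile pattern described above (probe the live state, act only on a difference, then re-check) can be sketched in shell. This is an illustrative sketch only, not the logic in modules/proxy.nix or modules/drone.nix; the reconcile function and the probe/apply callbacks it takes are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an idempotent reconcile step (not code from this repo).
#
# reconcile WANT PROBE_FN APPLY_FN
#   PROBE_FN prints the current state on stdout.
#   APPLY_FN receives WANT as its only argument and tries to reach it.
reconcile() {
  local want="$1" probe="$2" apply="$3"

  # Already at the desired state: do nothing, so re-running is harmless.
  if [ "$("$probe")" = "$want" ]; then
    echo "in sync"
    return 0
  fi

  # Drifted or never deployed: apply, then confirm by probing again.
  "$apply" "$want" || return 1
  if [ "$("$probe")" = "$want" ]; then
    echo "converged"
  else
    echo "apply ran but state still differs" >&2
    return 1
  fi
}
```

Because the probe reads live state rather than a recorded stamp, a change made by hand shows up as a difference on the next run, which is the property that lets a oneshot of this shape heal drift on activation and boot.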
Operator preconditions (class-A1, see DECISIONS.md / docs/baseline.md)
- Wildcard TLS cert at /var/lib/ci-certs/live/{fullchain.pem,privkey.pem} (*.ci.commoninternet.net + ci.commoninternet.net). Renewed out-of-band; never ACME here.
- DNS: *.ci.commoninternet.net (+ bare) → the gateway, which TLS-passthroughs (SNI) to cc-ci.
- Firewall path: gateway reaches cc-ci on tcp/80+443 (opened by modules/swarm.nix).
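The certificate precondition can be checked on the host before the first switch. A minimal sketch, assuming bash and OpenSSL 1.1.1 or newer (for the x509 -ext option); check_cert_files and cert_covers are hypothetical helper names, not part of this repo.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helpers; names are illustrative only.

# Succeeds when fullchain.pem and privkey.pem both exist and are non-empty
# in the given directory.
check_cert_files() {
  local dir="$1" f
  for f in fullchain.pem privkey.pem; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      return 1
    fi
  done
}

# Succeeds when the certificate's subjectAltName contains the given DNS name.
# This is a literal substring match, so treat it as a quick check only.
cert_covers() {
  local cert="$1" name="$2"
  openssl x509 -in "$cert" -noout -ext subjectAltName 2>/dev/null \
    | grep -qF "DNS:$name"
}

# Example use on the host:
#   check_cert_files /var/lib/ci-certs/live &&
#     cert_covers /var/lib/ci-certs/live/fullchain.pem '*.ci.commoninternet.net'
```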
1. Apply the NixOS flake (this is the whole install)
The flake (flake.nix, hosts/cc-ci/, modules/) declares: base host, sops-nix (decrypts via the
host SSH key), Docker + single-node Swarm + the proxy overlay + firewall 80/443
(modules/swarm.nix), abra (modules/abra.nix / packages.nix), the traefik reconcile oneshot
(modules/proxy.nix), the Drone server reconcile oneshot (modules/drone.nix), and the
Drone exec runner (modules/drone-runner.nix).
# materialise the repo on the host (the build runs on cc-ci itself — see DECISIONS.md deploy mech)
# e.g. git clone <repo> /root/cc-ci (or sync it)
nixos-rebuild switch --flake /root/cc-ci#cc-ci
On activation, the reconcile oneshots (deploy-proxy, deploy-drone) run automatically and converge
the swarm. Verify:
systemctl is-system-running # -> running
docker info --format '{{.Swarm.LocalNodeState}}' # -> active
docker service ls # traefik (app+socket-proxy) + drone, all 1/1
systemctl is-active deploy-proxy deploy-drone drone-runner-exec # -> active x3
# wildcard cert served end-to-end via the gateway:
curl -ksv --resolve probe.ci.commoninternet.net:443:<gateway-ip> https://probe.ci.commoninternet.net/ \
2>&1 | grep -E 'subject:|HTTP/' # -> CN=*.ci.commoninternet.net, HTTP 404 (no app router yet)
curl -ks --resolve drone.ci.commoninternet.net:443:<gateway-ip> \
-o /dev/null -w '%{http_code}\n' https://drone.ci.commoninternet.net/healthz # -> 200
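The local checks above can be wrapped so that a mismatch produces a non-zero exit status instead of relying on reading the output. A sketch under the assumption that the expected values are exactly those in the comments above; expect and verify_host are hypothetical names, and the curl probes are left out because they need the gateway IP.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of this repo.

# expect WANT CMD [ARGS...]: run CMD and compare its stdout with WANT.
expect() {
  local want="$1" got
  shift
  got="$("$@" 2>/dev/null)" || true
  if [ "$got" = "$want" ]; then
    echo "ok: $*"
  else
    echo "FAIL: $* (got '$got', want '$want')" >&2
    return 1
  fi
}

# Runs every check and reports failure if any single one failed.
verify_host() {
  local rc=0 unit
  expect running systemctl is-system-running || rc=1
  expect active docker info --format '{{.Swarm.LocalNodeState}}' || rc=1
  for unit in deploy-proxy deploy-drone drone-runner-exec; do
    expect active systemctl is-active "$unit" || rc=1
  done
  return "$rc"
}
```

Calling verify_host on cc-ci after the switch prints one line per check. It keeps going after a failure so that all failing checks are listed in one run.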
Tip: when driving the switch over an SSH session that rides Tailscale, run it as a detached unit so it survives a momentary drop, and use the absolute flake path (systemd units run with cwd /):
systemd-run --unit=ccci-sw --property=Type=oneshot nixos-rebuild switch --flake /root/cc-ci#cc-ci
2. One-time: link Drone ↔ Gitea (OAuth grant)
The only manual post-rebuild step. Drone needs the bot's Gitea OAuth token (granted by an
interactive login) before it can sync/clone repos; this can't be Nix-declared without putting the
bot password on the box. The token then persists in Drone's data volume.
GITEA_USERNAME=autonomic-bot GITEA_PASSWORD=… bash scripts/bootstrap-drone-oauth.sh
# -> "drone login ok (admin=true)" / "repo recipe-maintainers/cc-ci active=true"
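To keep the bot password out of shell history and off the command line, it can be read from a silent prompt and handed to the script through the environment. A minimal sketch, assuming bash (for read -s); with_gitea_password is a hypothetical wrapper, and it relies only on the GITEA_PASSWORD variable shown above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper; not part of this repo.
# Reads one line without echoing it, then runs the given command with
# GITEA_PASSWORD set in that command's environment only.
with_gitea_password() {
  local pw
  IFS= read -rs -p 'Gitea password: ' pw || return 1
  # The silent prompt leaves the cursor on the same line; add a newline
  # when running interactively.
  if [ -t 0 ]; then echo >&2; fi
  GITEA_PASSWORD="$pw" "$@"
}

# Example use:
#   export GITEA_USERNAME=autonomic-bot
#   with_gitea_password bash scripts/bootstrap-drone-oauth.sh
```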
Verify a build runs green: push any commit to the cc-ci repo and watch
https://drone.ci.commoninternet.net (or the API) — the push webhook (set on activation) triggers
the .drone.yml self-test on the exec runner.
3. (later milestones) comment-bridge, dashboard, recipe enrollment
See docs/enroll-recipe.md (D5), docs/secrets.md (D6), docs/runbook.md. Each new piece of infra
is added as another idempotent reconcile oneshot, so this install stays a single nixos-rebuild.