cc-ci/docs/install.md
autonomic-bot a385148af9 M2: Drone server + exec runner up; infra as idempotent-reconcile oneshots
Convert proxy+drone bring-up to writeShellApplication systemd oneshots that
reconcile every activation (orchestrator steer). pkgs.abra overlay. Runner
connected via RPC (polling, capacity=2). install.md = clone + nixos-rebuild switch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 22:59:59 +01:00


Installing cc-ci from scratch

WORK IN PROGRESS — grows with each milestone; the full from-scratch rebuild is verified at M9 (D8).

cc-ci is declared entirely as a NixOS flake (this repo). Bringing up the box takes only the operator preconditions, a clone, and nixos-rebuild switch; there are no manual post-steps. The proxy (traefik) and the Drone server are deployed by idempotent-reconcile systemd oneshots (modules/proxy.nix, modules/drone.nix) that converge the swarm to the desired state on every activation and on every boot, self-healing any drift, mirroring swarm-init. Target: a NixOS 24.11 host reachable as cc-ci over SSH (root).
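
The idempotent-reconcile pattern can be illustrated with a minimal shell sketch. This is not the actual content of modules/proxy.nix or modules/drone.nix: the function names (converge, is_converged, apply_desired) are hypothetical, and a scratch file stands in for swarm state. The point is only that the apply step runs when the live state differs from the desired state, so repeated runs are harmless and drift gets corrected.

```shell
# converge CHECK APPLY: run APPLY only when CHECK reports drift.
# CHECK and APPLY are names of shell functions or commands (hypothetical here).
converge() {
  local check=$1 apply=$2
  if "$check"; then
    echo "in-sync"
  else
    "$apply" && echo "converged"
  fi
}

# Demo: a scratch file stands in for the live swarm state (assumption).
state_file=$(mktemp)
desired="traefik drone"

is_converged() { [ "$(cat "$state_file")" = "$desired" ]; }
apply_desired() { printf '%s' "$desired" > "$state_file"; }

converge is_converged apply_desired   # first run prints: converged
converge is_converged apply_desired   # second run prints: in-sync
```

Because the check inspects live state rather than a "did I already run" marker, overwriting the scratch file by hand and rerunning converge brings it back to the desired value, which is the self-healing behaviour described above.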

Operator preconditions (class-A1, see DECISIONS.md / docs/baseline.md)

  • Wildcard TLS cert at /var/lib/ci-certs/live/{fullchain.pem,privkey.pem} (*.ci.commoninternet.net + ci.commoninternet.net). Renewed out-of-band; never ACME here.
  • DNS: *.ci.commoninternet.net (+ bare) → the gateway, which TLS-passthroughs (SNI) to cc-ci.
  • Firewall path: gateway reaches cc-ci on tcp/80+443 (opened by modules/swarm.nix).
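
Before the first switch, the certificate precondition can be checked with a short script along these lines. This is a sketch, not part of the repo: the function names are hypothetical, and cert_not_expired assumes the openssl CLI is available on the host.

```shell
# check_cert_files DIR: succeed only if both PEM files exist and are non-empty.
check_cert_files() {
  local dir=$1 f missing=0
  for f in fullchain.pem privkey.pem; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# cert_not_expired DIR: succeed if the leaf certificate has not yet expired.
# Assumes the openssl CLI is installed (not stated in this document).
cert_not_expired() {
  openssl x509 -noout -checkend 0 -in "$1/fullchain.pem" >/dev/null
}
```

Typical use on the host would be: check_cert_files /var/lib/ci-certs/live && cert_not_expired /var/lib/ci-certs/live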

1. Apply the NixOS flake (this is the whole install)

The flake (flake.nix, hosts/cc-ci/, modules/) declares: base host, sops-nix (decrypts via the host SSH key), Docker + single-node Swarm + the proxy overlay + firewall 80/443 (modules/swarm.nix), abra (modules/abra.nix / packages.nix), the traefik reconcile oneshot (modules/proxy.nix), the Drone server reconcile oneshot (modules/drone.nix), and the Drone exec runner (modules/drone-runner.nix).

# materialise the repo on the host (the build runs on cc-ci itself — see DECISIONS.md deploy mech)
#   e.g. git clone <repo> /root/cc-ci   (or sync it)
nixos-rebuild switch --flake /root/cc-ci#cc-ci

On activation, the reconcile oneshots (deploy-proxy, deploy-drone) run automatically and converge the swarm. Verify:

systemctl is-system-running                          # -> running
docker info --format '{{.Swarm.LocalNodeState}}'     # -> active
docker service ls                                    # traefik (app+socket-proxy) + drone, all 1/1
systemctl is-active deploy-proxy deploy-drone drone-runner-exec   # -> active x3
# wildcard cert served end-to-end via the gateway
# (substitute the gateway's address for <gateway-ip> before running):
curl -ksv --resolve probe.ci.commoninternet.net:443:<gateway-ip> https://probe.ci.commoninternet.net/ \
  2>&1 | grep -E 'subject:|HTTP/'    # -> CN=*.ci.commoninternet.net, HTTP 404 (no app router yet)
curl -ks --resolve drone.ci.commoninternet.net:443:<gateway-ip> \
  -o /dev/null -w '%{http_code}\n' https://drone.ci.commoninternet.net/healthz   # -> 200
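
Swarm services may need a few moments after activation before they answer, so a one-shot curl can fail spuriously. A small retry helper, sketched below, polls until a check passes. The names retry and drone_healthy and the GATEWAY_IP variable are assumptions introduced for illustration; they do not exist in the repo.

```shell
# retry ATTEMPTS DELAY CMD...: rerun CMD until it succeeds or attempts run out.
retry() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# drone_healthy: same probe as the healthz check above.
# GATEWAY_IP is a hypothetical variable holding the gateway's address.
drone_healthy() {
  [ "$(curl -ks --resolve "drone.ci.commoninternet.net:443:${GATEWAY_IP:?set GATEWAY_IP}" \
        -o /dev/null -w '%{http_code}' \
        https://drone.ci.commoninternet.net/healthz)" = 200 ]
}
```

For example, GATEWAY_IP set to the gateway's address followed by retry 30 2 drone_healthy would poll for up to roughly a minute before giving up with a non-zero exit status.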

Tip: when driving the switch over an SSH session that rides Tailscale, run it as a detached unit so it survives a momentary drop, and use the absolute flake path (systemd units run with cwd /):

systemd-run --unit=ccci-sw --property=Type=oneshot nixos-rebuild switch --flake /root/cc-ci#cc-ci

2. (later milestones) comment-bridge, dashboard, recipe enrollment

See docs/enroll-recipe.md (D5), docs/secrets.md (D6), docs/runbook.md. Each new piece of infra is added as another idempotent reconcile oneshot, so this install stays a single nixos-rebuild.