diff --git a/BACKLOG.md b/BACKLOG.md
index a7a9d1c..34b780f 100644
--- a/BACKLOG.md
+++ b/BACKLOG.md
@@ -16,7 +16,9 @@ Two single-writer sections (§6.1): Builder edits only `## Build backlog`; Adver
 ### M1 — Swarm + abra target
 - [x] Docker + single-node swarm via Nix (modules/swarm.nix: docker + swarm-init oneshot + `proxy`
   overlay net + daily autoprune). Verified: Swarm=active, proxy overlay present.
-- [ ] Traefik (file provider → /var/lib/ci-certs/live/) + per-run wildcard router
+- [x] Traefik (file provider → /var/lib/ci-certs/live/) as a swarm stack on `proxy`; wildcard cert
+  served as default cert. Verified end-to-end: gateway 143.244.213.108:443 SNI-passthrough →
+  cc-ci Traefik terminates TLS w/ `CN=*.ci.commoninternet.net` (LE E8), HTTP 404 (no router yet).
 - [ ] abra installed; deploy + tear down a trivial recipe by hand over HTTPS
 - [ ] Gate: M1 — recipe reachable over HTTPS at *.ci.commoninternet.net, torn down clean
 
diff --git a/JOURNAL.md b/JOURNAL.md
index 2f332ca..44edc5b 100644
--- a/JOURNAL.md
+++ b/JOURNAL.md
@@ -120,3 +120,29 @@ attached to `proxy`. Then abra install + by-hand HTTPS deploy/teardown of a triv
 Rationale for swarm-service Traefik over a host `services.traefik`: a host process isn't on the
 `proxy` overlay, so it can't reach swarm service VIPs; coop-cloud recipes assume an on-`proxy`
 Traefik watching swarm labels.
+
+## 2026-05-26 — M1: Traefik swarm stack + HTTPS path proven
+
+**modules/traefik.nix:** Traefik v3.3 as a swarm service on `proxy` (so it reaches recipe VIPs).
+Config via Nix `writeText` store files bind-mounted into the container (real files, not /etc
+symlinks): static `traefik.yml` (entrypoints web/websecure; `providers.swarm` unix socket,
+exposedByDefault=false, network=proxy; `providers.file` dir /etc/traefik/dynamic; ping; no
+dashboard) and dynamic `certs.yml` (wildcard at /var/lib/ci-certs/live/* as `stores.default.
+defaultCertificate` + certificates — so any *.ci.commoninternet.net router with tls=true is covered,
+no ACME). Deployed by a `traefik-deploy` oneshot (`docker stack deploy`) after swarm-init. Opened
+firewall 80/443 (gateway forwards over enp5s0).
+
+**Build + switch:** build EXIT 0; switch `Result=success`; `traefik-deploy` `Result=success`;
+`docker service ls` → `traefik_traefik traefik:v3.3 1/1`.
+
+**Verify (commands + output):**
+- Local: `curl -ksv -H 'Host: probe-test.ci.commoninternet.net' https://localhost/` →
+  `subject: CN=*.ci.commoninternet.net`, `issuer: …Let's Encrypt; CN=E8`, TLSv1.3, HTTP 404.
+- **End-to-end via gateway:** `curl -ksv --resolve probe-test.ci.commoninternet.net:443:143.244.213.108
+  https://probe-test.ci.commoninternet.net/` → `Connected to …(143.244.213.108) port 443`,
+  same wildcard cert, HTTP 404. Confirms gateway SNI-passthrough → cc-ci Traefik TLS termination.
+  404 is correct (no router for that host yet).
+
+**Next:** install abra (M1 last task), `abra app new` a trivial recipe (custom-html) → deploy →
+reach over HTTPS at .ci.commoninternet.net → teardown leaving no volumes. That completes M1
+→ CLAIM M1 gate.
diff --git a/STATUS.md b/STATUS.md
index 57992fa..276aa13 100644
--- a/STATUS.md
+++ b/STATUS.md
@@ -1,9 +1,9 @@
 # STATUS — cc-ci Builder
 
 **Phase:** M0 → M1. M0 complete & CLAIMED; starting M1 (swarm + Traefik + abra) while awaiting verdict.
-**In-flight:** M1 — Traefik next (file provider → /var/lib/ci-certs/live/, docker-swarm provider).
-Docker + single-node swarm done.
-**Last updated:** 2026-05-26 (M1 swarm up)
+**In-flight:** M1 — abra install + by-hand HTTPS deploy/teardown of a trivial recipe (M1 gate).
+Swarm + Traefik (wildcard cert via gateway passthrough) both up and verified.
+**Last updated:** 2026-05-26 (M1 Traefik up, HTTPS path proven)
 ## Gates
 - **Gate: M0 — CLAIMED, awaiting Adversary** (2026-05-26).
   Evidence: flake rebuilds cc-ci from repo
diff --git a/hosts/cc-ci/configuration.nix b/hosts/cc-ci/configuration.nix
index 3a98d78..22b900f 100644
--- a/hosts/cc-ci/configuration.nix
+++ b/hosts/cc-ci/configuration.nix
@@ -7,6 +7,7 @@
     ./hardware.nix
     ../../modules/secrets.nix
     ../../modules/swarm.nix
+    ../../modules/traefik.nix
   ];
 
   # --- Tailscale (ACCESS-CRITICAL: do not break, this is the only route in) ---
diff --git a/modules/traefik.nix b/modules/traefik.nix
new file mode 100644
index 0000000..c700415
--- /dev/null
+++ b/modules/traefik.nix
@@ -0,0 +1,96 @@
+# Traefik for the test swarm (M1). Runs as a swarm service on the `proxy` overlay so it can
+# reach recipe service VIPs (a host process couldn't). TLS terminates here using the operator's
+# pre-issued wildcard cert via the file provider — NO ACME for commoninternet.net (§4.0).
+# Recipe routers only need `traefik.enable=true` + a Host(...) rule + tls=true; the default
+# certificate (the wildcard) is served for every *.ci.commoninternet.net host.
+{ pkgs, ... }:
+let
+  # Static config. Docker *Swarm* provider (v3) + file provider for the cert.
+  staticCfg = pkgs.writeText "traefik.yml" ''
+    entryPoints:
+      web:
+        address: ":80"
+      websecure:
+        address: ":443"
+    providers:
+      swarm:
+        endpoint: "unix:///var/run/docker.sock"
+        exposedByDefault: false
+        network: proxy
+      file:
+        directory: /etc/traefik/dynamic
+        watch: true
+    log:
+      level: INFO
+    accessLog: {}
+    api:
+      dashboard: false
+    ping: {}
+  '';
+
+  # Dynamic config: serve the pre-issued wildcard as the DEFAULT certificate, so any
+  # *.ci.commoninternet.net router with tls=true is covered without a cert resolver.
+  certsCfg = pkgs.writeText "certs.yml" ''
+    tls:
+      stores:
+        default:
+          defaultCertificate:
+            certFile: /var/lib/ci-certs/live/fullchain.pem
+            keyFile: /var/lib/ci-certs/live/privkey.pem
+      certificates:
+        - certFile: /var/lib/ci-certs/live/fullchain.pem
+          keyFile: /var/lib/ci-certs/live/privkey.pem
+  '';
+
+  stack = pkgs.writeText "traefik-stack.yml" ''
+    version: "3.8"
+    services:
+      traefik:
+        image: traefik:v3.3
+        ports:
+          - target: 80
+            published: 80
+            mode: host
+          - target: 443
+            published: 443
+            mode: host
+        volumes:
+          - /var/run/docker.sock:/var/run/docker.sock:ro
+          - /var/lib/ci-certs/live:/var/lib/ci-certs/live:ro
+          - ${staticCfg}:/etc/traefik/traefik.yml:ro
+          - ${certsCfg}:/etc/traefik/dynamic/certs.yml:ro
+        networks:
+          - proxy
+        deploy:
+          mode: replicated
+          replicas: 1
+          placement:
+            constraints:
+              - node.role == manager
+          restart_policy:
+            condition: any
+    networks:
+      proxy:
+        external: true
+  '';
+in
+{
+  # Gateway forwards 80/443 to cc-ci over the public interface (enp5s0), so open them.
+  networking.firewall.allowedTCPPorts = [ 80 443 ];
+
+  systemd.services.traefik-deploy = {
+    description = "Deploy the Traefik swarm stack";
+    after = [ "swarm-init.service" ];
+    requires = [ "swarm-init.service" ];
+    wantedBy = [ "multi-user.target" ];
+    path = [ pkgs.docker ];
+    serviceConfig = {
+      Type = "oneshot";
+      RemainAfterExit = true;
+    };
+    script = ''
+      set -eu
+      docker stack deploy --detach=true -c ${stack} traefik
+    '';
+  };
+}
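Reviewer note: the JOURNAL.md hunk records two manual `curl -ksv` probes whose pass criteria are the wildcard subject and an HTTP 404. A minimal sketch of how that check could be scripted follows. It is not part of this change: `check_probe` is a hypothetical helper name, and the patterns assume curl's verbose output contains a `subject: CN=...` line and a `< HTTP/<version> <status>` line, as the JOURNAL excerpts suggest.

```shell
#!/bin/sh
# Sketch only: check_probe is a hypothetical helper, not something in the repo.
# It reads `curl -ksv` verbose output on stdin and succeeds only when
#   1. the served certificate subject is the wildcard CN, and
#   2. the response status is 404 (expected while no router matches the host).
check_probe() {
  out=$(cat)
  # BRE: \* and \. match the literal characters in the wildcard name.
  printf '%s\n' "$out" | grep -q 'subject: CN=\*\.ci\.commoninternet\.net' || return 1
  # Accept both "< HTTP/1.1 404 Not Found" and "< HTTP/2 404".
  printf '%s\n' "$out" | grep -Eq '^< HTTP/[0-9.]+ 404' || return 1
}
```

Because curl writes the verbose trace to stderr, the local probe from the JOURNAL would be piped as `curl -ksv -H 'Host: probe-test.ci.commoninternet.net' https://localhost/ 2>&1 >/dev/null | check_probe`. Once a recipe router exists for a host, 404 stops being the expected status, so the second pattern would need to change.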