M1: Traefik swarm stack (wildcard cert via file provider); HTTPS path proven E2E

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 21:55:08 +01:00
parent ab839ae61d
commit 51b18841bc
5 changed files with 129 additions and 4 deletions


@@ -16,7 +16,9 @@ Two single-writer sections (§6.1): Builder edits only `## Build backlog`; Adver
### M1 — Swarm + abra target
- [x] Docker + single-node swarm via Nix (modules/swarm.nix: docker + swarm-init oneshot + `proxy`
overlay net + daily autoprune). Verified: Swarm=active, proxy overlay present.
- [ ] Traefik (file provider → /var/lib/ci-certs/live/) + per-run wildcard router
- [x] Traefik (file provider → /var/lib/ci-certs/live/) as a swarm stack on `proxy`; wildcard cert
served as default cert. Verified end-to-end: gateway 143.244.213.108:443 SNI-passthrough →
cc-ci Traefik terminates TLS w/ `CN=*.ci.commoninternet.net` (LE E8), HTTP 404 (no router yet).
- [ ] abra installed; deploy + tear down a trivial recipe by hand over HTTPS
- [ ] Gate: M1 — recipe reachable over HTTPS at *.ci.commoninternet.net, torn down clean


@@ -120,3 +120,29 @@ attached to `proxy`. Then abra install + by-hand HTTPS deploy/teardown of a triv
Rationale for swarm-service Traefik over a host `services.traefik`: a host process isn't on the
`proxy` overlay, so it can't reach swarm service VIPs; coop-cloud recipes assume an on-`proxy`
Traefik watching swarm labels.
## 2026-05-26 — M1: Traefik swarm stack + HTTPS path proven
**modules/traefik.nix:** Traefik v3.3 as a swarm service on `proxy` (so it reaches recipe VIPs).
Config via Nix `writeText` store files bind-mounted into the container (real files, not /etc
symlinks): static `traefik.yml` (entrypoints web/websecure; `providers.swarm` unix socket,
exposedByDefault=false, network=proxy; `providers.file` dir /etc/traefik/dynamic; ping; no
dashboard) and dynamic `certs.yml` (wildcard at /var/lib/ci-certs/live/* as
`tls.stores.default.defaultCertificate` plus a `tls.certificates` entry, so any
*.ci.commoninternet.net router with tls=true is covered,
no ACME). Deployed by a `traefik-deploy` oneshot (`docker stack deploy`) after swarm-init. Opened
firewall 80/443 (gateway forwards over enp5s0).
**Build + switch:** build EXIT 0; switch `Result=success`; `traefik-deploy` `Result=success`;
`docker service ls` → `traefik_traefik traefik:v3.3 1/1`.
**Verify (commands + output):**
- Local: `curl -ksv -H 'Host: probe-test.ci.commoninternet.net' https://localhost/` →
`subject: CN=*.ci.commoninternet.net`, `issuer: …Let's Encrypt; CN=E8`, TLSv1.3, HTTP 404.
- **End-to-end via gateway:** `curl -ksv --resolve probe-test.ci.commoninternet.net:443:143.244.213.108
https://probe-test.ci.commoninternet.net/` → `Connected to …(143.244.213.108) port 443`,
same wildcard cert, HTTP 404. Confirms gateway SNI-passthrough → cc-ci Traefik TLS termination.
404 is correct (no router for that host yet).
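The certificate check above could also be scripted for repeat runs. The following is a minimal sketch, not part of this commit: the script name `check-wildcard-cert` and the use of `openssl s_client` in place of `curl -v` are assumptions for illustration. It connects with the probe hostname as SNI and fails unless the served certificate subject contains the wildcard name.

```nix
# Hypothetical helper (not in the repo): assert that the endpoint serves the wildcard cert.
{ pkgs ? import <nixpkgs> { } }:
pkgs.writeShellApplication {
  name = "check-wildcard-cert";
  runtimeInputs = [ pkgs.openssl pkgs.gnugrep ];
  text = ''
    # SNI name to present; any *.ci.commoninternet.net host should get the default cert.
    host="probe-test.ci.commoninternet.net"
    # First argument is host:port to connect to; defaults to the local Traefik.
    target="''${1:-localhost:443}"

    # Fetch the leaf certificate and print its subject line.
    subject="$(openssl s_client -connect "$target" -servername "$host" </dev/null 2>/dev/null \
      | openssl x509 -noout -subject)"
    echo "$subject"

    # Exit non-zero if the subject is not the wildcard.
    echo "$subject" | grep -qF '*.ci.commoninternet.net'
  '';
}
```

Assuming the expression is saved as `check-wildcard-cert.nix` (a hypothetical filename), `nix-build check-wildcard-cert.nix` followed by `./result/bin/check-wildcard-cert 143.244.213.108:443` would repeat the end-to-end check through the gateway.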
**Next:** install abra (M1 last task), `abra app new` a trivial recipe (custom-html) → deploy →
reach over HTTPS at <app>.ci.commoninternet.net → teardown leaving no volumes. That completes M1
→ CLAIM M1 gate.


@@ -1,9 +1,9 @@
# STATUS — cc-ci Builder
**Phase:** M0 → M1. M0 complete & CLAIMED; starting M1 (swarm + Traefik + abra) while awaiting verdict.
**In-flight:** M1 — Traefik next (file provider → /var/lib/ci-certs/live/, docker-swarm provider).
Docker + single-node swarm done.
**Last updated:** 2026-05-26 (M1 swarm up)
**In-flight:** M1 — abra install + by-hand HTTPS deploy/teardown of a trivial recipe (M1 gate).
Swarm + Traefik (wildcard cert via gateway passthrough) both up and verified.
**Last updated:** 2026-05-26 (M1 Traefik up, HTTPS path proven)
## Gates
- **Gate: M0 — CLAIMED, awaiting Adversary** (2026-05-26). Evidence: flake rebuilds cc-ci from repo


@@ -7,6 +7,7 @@
./hardware.nix
../../modules/secrets.nix
../../modules/swarm.nix
../../modules/traefik.nix
];
# --- Tailscale (ACCESS-CRITICAL: do not break, this is the only route in) ---

modules/traefik.nix Normal file

@@ -0,0 +1,96 @@
# Traefik for the test swarm (M1). Runs as a swarm service on the `proxy` overlay so it can
# reach recipe service VIPs (a host process couldn't). TLS terminates here using the operator's
# pre-issued wildcard cert via the file provider — NO ACME for commoninternet.net (§4.0).
# Recipe routers only need `traefik.enable=true` + a Host(...) rule + tls=true; the default
# certificate (the wildcard) is served for every *.ci.commoninternet.net host.
{ pkgs, ... }:
let
  # Static config. Docker *Swarm* provider (v3) + file provider for the cert.
  staticCfg = pkgs.writeText "traefik.yml" ''
    entryPoints:
      web:
        address: ":80"
      websecure:
        address: ":443"
    providers:
      swarm:
        endpoint: "unix:///var/run/docker.sock"
        exposedByDefault: false
        network: proxy
      file:
        directory: /etc/traefik/dynamic
        watch: true
    log:
      level: INFO
    accessLog: {}
    api:
      dashboard: false
    ping: {}
  '';
  # Dynamic config: serve the pre-issued wildcard as the DEFAULT certificate, so any
  # *.ci.commoninternet.net router with tls=true is covered without a cert resolver.
  certsCfg = pkgs.writeText "certs.yml" ''
    tls:
      stores:
        default:
          defaultCertificate:
            certFile: /var/lib/ci-certs/live/fullchain.pem
            keyFile: /var/lib/ci-certs/live/privkey.pem
      certificates:
        - certFile: /var/lib/ci-certs/live/fullchain.pem
          keyFile: /var/lib/ci-certs/live/privkey.pem
  '';
  stack = pkgs.writeText "traefik-stack.yml" ''
    version: "3.8"
    services:
      traefik:
        image: traefik:v3.3
        ports:
          - target: 80
            published: 80
            mode: host
          - target: 443
            published: 443
            mode: host
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock:ro
          - /var/lib/ci-certs/live:/var/lib/ci-certs/live:ro
          - ${staticCfg}:/etc/traefik/traefik.yml:ro
          - ${certsCfg}:/etc/traefik/dynamic/certs.yml:ro
        networks:
          - proxy
        deploy:
          mode: replicated
          replicas: 1
          placement:
            constraints:
              - node.role == manager
          restart_policy:
            condition: any
    networks:
      proxy:
        external: true
  '';
in
{
  # Gateway forwards 80/443 to cc-ci over the public interface (enp5s0), so open them.
  networking.firewall.allowedTCPPorts = [ 80 443 ];
  systemd.services.traefik-deploy = {
    description = "Deploy the Traefik swarm stack";
    after = [ "swarm-init.service" ];
    requires = [ "swarm-init.service" ];
    wantedBy = [ "multi-user.target" ];
    path = [ pkgs.docker ];
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
    };
    script = ''
      set -eu
      docker stack deploy --detach=true -c ${stack} traefik
    '';
  };
}
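
The header comment of this module says a recipe router only needs `traefik.enable=true`, a Host(...) rule, and tls=true. As a rough illustration of what that could look like, here is a sketch of a throwaway stack file generated the same way as the Traefik stack. Everything specific in it is an assumption and not part of this commit: the `whoami` service, the `traefik/whoami` image, the router name `demo`, and the host `demo.ci.commoninternet.net`. With the swarm provider, labels go under `deploy.labels`, and the backend port has to be declared explicitly.

```nix
# Hypothetical demo stack (not in the repo): one service routed by the Traefik above.
{ pkgs ? import <nixpkgs> { } }:
pkgs.writeText "demo-stack.yml" ''
  version: "3.8"
  services:
    whoami:
      image: traefik/whoami
      networks:
        - proxy
      deploy:
        labels:
          # Opt in, since exposedByDefault is false.
          - "traefik.enable=true"
          # Host rule under the wildcard domain; the default certificate covers it.
          - "traefik.http.routers.demo.rule=Host(`demo.ci.commoninternet.net`)"
          - "traefik.http.routers.demo.entrypoints=websecure"
          - "traefik.http.routers.demo.tls=true"
          # The swarm provider needs the container port stated explicitly.
          - "traefik.http.services.demo.loadbalancer.server.port=80"
  networks:
    proxy:
      external: true
''
```

If this were saved as `demo-stack.nix` (hypothetical), `nix-build demo-stack.nix` would produce `./result`, which could then be passed to `docker stack deploy -c ./result demo`. A request for the demo host should then return 200 where the probe host returned 404.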