# Architecture

cc-ci turns a `!testme` PR comment into a real end-to-end deploy + test of a Co-op Cloud recipe and reports the result back. Everything on the `cc-ci` host is declared in this repo's NixOS flake.

## Components

| Component | Where | Role |
|---|---|---|
| **comment-bridge** | `bridge/bridge.py`, `modules/bridge.nix` (swarm svc, `ci.commoninternet.net/hook`) | Polls enrolled repos for `!testme` (primary, read-only) + optional admin webhook; authorizes the commenter (org membership); triggers a parameterized Drone build; posts/edits the PR comment with the run link + final pass/fail. |
| **Drone server** | `modules/drone.nix` (coop-cloud `drone` recipe via abra; `drone.ci.commoninternet.net`, Gitea SSO) | CI engine. Holds the `recipe-ci` (custom-event) and `self-test` (push) pipelines (`.drone.yml`). |
| **Drone exec runner** | `modules/drone-runner.nix` (host systemd service) | Runs pipeline steps **on the host** so they can drive `abra`/Docker. `DRONE_RUNNER_CAPACITY=1` (MAX_TESTS) caps concurrent builds; the rest queue natively. |
| **harness** | `runner/run_recipe_ci.py` + `runner/harness/` + `tests/` | Orchestrates per run: fetch recipe at the PR head → install → upgrade → backup/restore → recipe-local (D4) → guaranteed teardown. pytest + Playwright via the Nix `cc-ci-run` env. |
| **swarm + traefik** | `modules/swarm.nix`, `modules/proxy.nix` (coop-cloud `traefik` recipe via abra) | Single-node Docker Swarm + `proxy` overlay; traefik terminates TLS with the pre-issued wildcard cert (file provider, **no ACME**). The real deploy target for recipes-under-test. |
| **backup-bot-two** | `modules/backupbot.nix` | restic-based volume/DB backups; `abra app backup/restore` drive it. |
| **dashboard** | `dashboard/dashboard.py`, `modules/dashboard.nix` (`ci.commoninternet.net`) | YunoHost-CI-like overview: latest run per recipe + status badges + run links; `/badge/.svg`. |
| **secrets** | `modules/secrets.nix` + `secrets/secrets.yaml` (sops-nix) | Infra secrets, decrypted at activation via the host SSH key as the age identity. See `secrets.md`. |

All swarm infra (traefik, drone, bridge, dashboard, backupbot) is brought up by **idempotent-reconcile systemd oneshots** that converge on every activation/boot (no run-once sentinels), so a from-scratch install is `git clone` + `nixos-rebuild switch` + the operator preconditions (`install.md`).

## The `!testme` flow

```
PR comment "!testme"
   │  (poll ≤30s, read-only; or optional admin webhook → /hook, HMAC-verified)
   ▼
comment-bridge: exact-match "!testme"? · commenter ∈ recipe-maintainers org? · resolve PR head
   ▼
Drone API: create build (event=custom, params RECIPE/REF/PR/SRC)
   ▼
recipe-ci pipeline (exec runner, on host): cc-ci-run runner/run_recipe_ci.py
   │  fetch recipe@PR-head (mirror clone + upstream version tags) → install → upgrade → backup
   │  → recipe-local (D4) → ALWAYS teardown (undeploy+volumes+secrets, verified)
   ▼
bridge watcher polls the build → edits the PR comment to ✅ passed / ❌
   ▼
dashboard reflects latest-per-recipe status + badges
```

## Network & TLS (see install.md §domain)

`*.ci.commoninternet.net` (and bare `ci.commoninternet.net`) resolve to an operator **gateway** that **TLS-passthroughs** by SNI to cc-ci. cc-ci's traefik terminates TLS with the **pre-issued wildcard cert** at `/var/lib/ci-certs/live/` (no ACME, no DNS token on the box). Each run gets a unique short subdomain `-<6hex>.ci.commoninternet.net` (covered by the wildcard) so concurrent/serial runs never collide; it's torn down at run end.

## Resource safety (§4.2/§4.3)

- **MAX_TESTS=1** (runner capacity) → at most one test app live; Drone queues the rest.
- **Per-build timeout 60m** (Drone repo timeout) → a hung build is killed, freeing the slot.
- **Guaranteed teardown** (`try/finally`) + a **run-start janitor** that reaps orphaned `*-`-scheme apps (backstop for a SIGKILL'd build). `CCCI_JANITOR_MAX_AGE=0` in the recipe-ci pipeline (safe at capacity=1).
- Heavy recipes pull many images; keep registry creds configured + adequate disk (see `runbook.md`).

## Enrolling a recipe (D5, see enroll-recipe.md)

Add `tests//` (recipe_meta.py + test_install/upgrade/backup.py) + the repo to the bridge `POLL_REPOS`. Per-recipe quirks go in `recipe_meta.py` (HEALTH_PATH/timeouts, `EXTRA_ENV` for e.g. cryptpad's SANDBOX_DOMAIN or lasuite's TIMEOUT), with **no shared-harness edits**.
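
To make the per-recipe override idea concrete, here is a minimal sketch of what a `recipe_meta.py` could look like. Only the names `HEALTH_PATH` and `EXTRA_ENV` come from the text above; the timeout name `DEPLOY_TIMEOUT_S`, all values, and the `build_env` helper are hypothetical and only illustrate how a shared harness might overlay recipe-specific settings on its defaults. See `enroll-recipe.md` for the actual schema.

```python
# Illustrative recipe_meta.py sketch. Values are placeholders, not real config.

# Path the harness could poll to decide the deployed app is healthy
# (name from the architecture doc; the value is an assumption).
HEALTH_PATH = "/"

# Hypothetical name for a per-recipe timeout, in seconds.
DEPLOY_TIMEOUT_S = 600

# Extra environment entries for the app under test, e.g. a sandbox domain.
# The value is a placeholder; a real one would depend on the per-run domain.
EXTRA_ENV = {
    "SANDBOX_DOMAIN": "sandbox.example.invalid",
}


def build_env(base_env: dict[str, str]) -> dict[str, str]:
    """Hypothetical harness-side helper, shown here only for illustration.

    Returns a new dict in which recipe overrides win over harness defaults;
    neither input mapping is mutated.
    """
    return {**base_env, **EXTRA_ENV}
```

Keeping overrides as plain module-level data means the shared harness can import them without any recipe-specific branching, which matches the "no shared-harness edits" goal.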