Adds terraform/ (hcloud provider, cpx32/nbg1/debian-12) and a new
nix/hosts/cc-ci-hetzner/ flake host to provision the cc-ci server on
Hetzner Cloud as an alternative to the Incus cc-nix-test VM.
Stage 1 (Terraform): creates a cpx32 server (4 vCPU / 8 GB / x86 AMD,
Nuremberg), runs nixos-infect (pinned rev 40f62a6, 2026-03-22) to convert
Debian 12 → NixOS 24.11, and reboots into bare NixOS.
Stage 2 (manual, per terraform/README.md): clone cc-ci --recursive,
provision the bootstrap age key, then `nixos-rebuild switch --flake
.#cc-ci-hetzner`.
Verified (throwaway run 2026-05-31, server 134464512, 168.119.126.100):
- terraform apply: cpx32 in nbg1 created in 17 s
- nixos-infect: NixOS 24.11.719113.50ab793786d9 (same nixpkgs pin as flake)
- nixos-rebuild build --flake .#cc-ci-hetzner: exit 0 on server
(131 derivations; all cc-ci modules: tailscale, drone, drone-runner,
bridge, dashboard, harness, swarm, abra, proxy, secrets)
- terraform plan: no changes (idempotent)
- terraform destroy: server + SSH key removed
Age key step (plan §4 Stage 2): operator-pending. Full switch/convergence
requires bootstrap age key at /var/lib/sops-nix/key.txt. Flake builds
without it; activation needs it.
No secrets committed: HCLOUD_TOKEN via env, tfstate gitignored,
networking.nix contains throwaway IP (update per README for production).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
immich's compose bind-mounts the host /etc/localtime into the app container; NixOS without a set
timezone leaves /etc/localtime absent → 'bind source path does not exist: /etc/localtime' → app
service rejected (never converges). time.timeZone=UTC creates /etc/localtime (UTC = deterministic CI
timestamps). Nix-declared, reversible; helps any recipe binding /etc/localtime.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The committed module used systemd.services.docker-prune, which conflicts with the NixOS docker
module's own docker-prune unit (`nixos-rebuild build` error: conflicting definition values). The
deployed+verified host already runs ci-docker-prune; this syncs the repo so a cold build matches.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Removes virtualisation.docker.autoPrune (daily `docker system prune --all` evicted in-use base
images → cold re-pull → Hub rate-limit churn, JOURNAL-2). Adds modules/docker-prune.nix: daily
timer + oneshot that prunes only dangling+until=24h, gated on disk pressure (>=80%) AND no run-app
live AND no swarm service converging; never --all, never --volumes. Teardown unchanged (never
removes images). Registry pull-through cache dropped per operator scope correction.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
canonical.enrolled_recipes; runner/nightly_sweep.py (roll keycloak+traefik →
serial full-cold over enrolled on latest → green promotes; skip if test active;
operate against CCCI_REPO checkout for tests/); nix/modules/nightly-sweep.nix
(timer 03:00 Persistent + oneshot service) wired in. 2 bugs fixed via live
service run (repo-relative enrolled scan; util-linux for backup PTY). Live
SERVICE sweep: enrolled=['custom-html'] → all tiers green → canonical advanced
1.10.0→1.11.0; red-run correctly does NOT promote. 71 unit pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The autoPrune flags passed '--volumes' WITH '--filter until=24h', which docker
rejects ('until filter not supported with --volumes') — so docker-prune.service
FAILED every day (system 'degraded') and never reclaimed anything (a cause of the
disk creeping to 96%). Worse, '--volumes' prunes volumes with no running
container — which would DELETE Phase-2w DATA-WARM canonical volumes (undeployed by
design). Removed '--volumes': now prunes images/containers/networks/build-cache
older than 24h only; warm volumes survive and are pruned deliberately by the warm
reconcilers (WC8).
Verified: nixos-rebuild switch -> docker-prune.service runs clean, system
'running' (0 failed units), warm keycloak still 200.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- set_env: ensure trailing newline before append (keycloak .env.sample ends
with a newline-less #COMPOSE_FILE comment, so a bare append glued DOMAIN onto
it -> DOMAIN unset -> KC_HOSTNAME=https:// -> crash-loop). Same bite fixed in
backupbot.nix.
- converge skips the (forced) redeploy when keycloak already serves 200, so an
activation/boot is a true no-op (no JVM-restart blip) and only redeploys when
down/crash-looping. Health-wait extended to 15min.
Verified on cc-ci: nixos-rebuild switch -> warm-keycloak.service active,
'no-op converge', system running (0 failed), /realms/master=200.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
nix/modules/warm-keycloak.nix: idempotent systemd oneshot (like deploy-proxy)
that converges a live-warm shared keycloak at warm-keycloak.ci.commoninternet.net
pinned to 10.7.1+26.6.2, secrets generated only-if-missing (never
rotate a live provider), waits /realms/master=200. Re-warmable from scratch
(D8/WC8). Wired into hosts/cc-ci/configuration.nix.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
hedgedoc mirrored to recipe-maintainers/hedgedoc with probe PR #1; add it to the bridge poll list so
!testme triggers the full generic suite (no cc-ci/repo-local overlay -> pure generic). Rebuild pending.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
flake.nix/flake.lock STAY at root so the build ref #cc-ci is unchanged; only flake's internal
configuration.nix path updated. Root-relative refs inside moved modules re-based ../X -> ../../X
(secrets/bridge/dashboard); configuration.nix's ../../modules imports unchanged (both dirs under nix/).
Living docs (README, architecture/install/secrets/enroll) + .drone.yml comment updated to nix/...;
append-only history logs left as-is. DECISIONS.md records RL5 + the deferred-coordinated RL6.
Verified on cc-ci: nixos-rebuild build 'path:#cc-ci' -> toplevel 8i3jcad9 (BYTE-IDENTICAL to the
pre-move build — store derivations are content-addressed on file contents, module .nix not in the
runtime closure); scripts/lint.sh -> lint: PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>