cc-ci-orchestrator/memory/cc-ci-host-rebuild-procedure.md


---
name: cc-ci-host-rebuild-procedure
description: How to nixos-rebuild the live cc-ci SERVER host (no self-service path exists; worked out 2026-06-13)
metadata:
node_type: memory
type: reference
originSessionId: 85355980-5e4f-4f90-b1ca-d0e4fe82f04b
---
The cc-ci **server** host (ssh alias `cc-ci`, Hetzner) has **no repo/script on it** to rebuild from
(`/root/cc-ci` does not exist; last operator rebuild was 2026-05-31). Orchestrator procedure to deploy
a nix change (e.g. a `deploy-proxy.service` / systemd-unit change that can't be applied at the docker
layer), established during phase `pvfix`/`pxgate`:
1. Stage current `main` on the host: `rsync -a --delete /home/loops/work/cc-ci-fix/ root@cc-ci:/root/cc-ci-deploy/`
(orchestrator clone must be on the target ref + clean).
2. `ssh cc-ci 'chown -R root:root /root/cc-ci-deploy'`; otherwise git fails with a fatal "repository not owned by current user" error.
3. **Copy the operator-held sops secrets** (NOT in git): `cp /etc/cc-ci/secrets/secrets.yaml /root/cc-ci-deploy/secrets/secrets.yaml`.
Without it the build fails with a fatal `secrets/secrets.yaml does not exist` (sops module).
The age key is at `/var/lib/sops-nix/key.txt`.
4. `rm -rf /root/cc-ci-deploy/.git` — a git flake only includes *tracked* files, so the untracked
secrets.yaml is excluded; dropping `.git` makes it a plain path flake that uses ALL files. (flake.nix
has no `self.rev` dependency, so this is safe.)
5. Build first: `cd /root/cc-ci-deploy && nixos-rebuild build --flake .#cc-ci` (target is `.#cc-ci` =
`.#cc-ci-hetzner` = `nix/hosts/cc-ci-hetzner/`). nixpkgs is PINNED to the running rev, so only the
changed cc-ci modules rebuild — small + fast, not a giant bump.
6. `nixos-rebuild switch --flake .#cc-ci`. Then verify: `systemctl is-active deploy-proxy`,
`systemctl --failed`, `docker service ls` shows every service at N/N replicas, and routed endpoints return 200.
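
The "all N/N" check in step 6 can be scripted rather than eyeballed. A minimal sketch follows, assuming the replica column has the usual `running/desired` shape; `replicas_ok` is a hypothetical helper name, not something that exists on the host, and swarm jobs (which print extra text in that column) would need separate handling.

```shell
#!/usr/bin/env bash
# Sketch: fail if any service reports fewer running replicas than desired.
# Reads lines of the form "<name> <running>/<desired>" on stdin.
# replicas_ok is a hypothetical helper, not part of the cc-ci repo.
replicas_ok() {
  local name replicas rc=0
  while read -r name replicas _; do
    # Skip blank lines.
    [ -z "$name" ] && continue
    # Replicas looks like "2/2"; compare the part before and after the slash.
    if [ "${replicas%%/*}" != "${replicas##*/}" ]; then
      echo "NOT READY: $name $replicas" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage on the host (commented out so the sketch is safe to source):
# docker service ls --format '{{.Name}} {{.Replicas}}' | replicas_ok
```

The function returns non-zero as soon as one service is below its desired count, so it can gate any follow-up step.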
Operator must authorize (and pick a no-CI window) — a switch cycles reconcile oneshots
(deploy-proxy, warm-keycloak). A true from-scratch boot proof = reboot the host.
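
Steps 1 to 6 can be collected into one dry-run-by-default script, run from the orchestrator. This is a sketch under assumptions, not an existing tool: `rebuild_cc_ci`, `run`, and the `APPLY` switch are hypothetical names introduced here, and it assumes steps 3 to 6 are executed on the host over the same `cc-ci` ssh alias used in step 2. Without `APPLY=1` it only prints the commands, which fits the rule that the operator must authorize the switch.

```shell
#!/usr/bin/env bash
# Sketch of steps 1-6. Prints the plan unless APPLY=1 is set.
# APPLY, run and rebuild_cc_ci are hypothetical names for illustration.
SRC=/home/loops/work/cc-ci-fix/
DEST=/root/cc-ci-deploy

# Execute the command when APPLY=1, otherwise just print it.
run() {
  if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '+ %s\n' "$*"; fi
}

rebuild_cc_ci() {
  # 1. Stage the clean orchestrator clone on the host.
  run rsync -a --delete "$SRC" "root@cc-ci:$DEST/" &&
  # 2. Fix ownership so the flake evaluation does not reject the tree.
  run ssh cc-ci "chown -R root:root $DEST" &&
  # 3. Copy the operator-held sops secrets (not in git).
  run ssh cc-ci "cp /etc/cc-ci/secrets/secrets.yaml $DEST/secrets/secrets.yaml" &&
  # 4. Drop .git so the flake is a plain path flake that sees all files.
  run ssh cc-ci "rm -rf $DEST/.git" &&
  # 5. Build first, so a failure cannot touch the running system.
  run ssh cc-ci "cd $DEST && nixos-rebuild build --flake .#cc-ci" &&
  # 6. Switch only after the build succeeded.
  run ssh cc-ci "cd $DEST && nixos-rebuild switch --flake .#cc-ci"
}
```

Calling `rebuild_cc_ci` prints the six commands for review; `APPLY=1 rebuild_cc_ci` executes them and stops at the first failure because the steps are chained with `&&`.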