1c/W3: throwaway VM created (booting); W4 design notes (keyFile/recovery-key, tailnet, bridge)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 17:06:23 +01:00
parent 6c03a27b16
commit dc81c16b9d


@@ -139,3 +139,31 @@ PATCH `limits.memory=4GB` (http 200) → PUT state start (op Success, Running).
Running RAM now: cc-nix-test 4 + lichen-staging 4 = 8 GB; throwaway 4 → 12 GB ≤ 16 physical (guideline OK).
**Next: W3** — create blank 4 GB NixOS VM in terraform-ci, provision ONLY the bootstrap (recovery) age key.
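The RAM tally above can be checked mechanically. A minimal sketch: the VM sizes and the 16 GB physical figure come from the note, while the function name `ram_headroom_gb` is hypothetical.

```python
def ram_headroom_gb(running_gb, new_vm_gb, physical_gb):
    """Return the physical RAM left over once the new VM is also running."""
    committed = sum(running_gb.values()) + new_vm_gb
    return physical_gb - committed


# Figures taken from the note: two running 4 GB VMs, a 4 GB throwaway, 16 GB physical.
running = {"cc-nix-test": 4, "lichen-staging": 4}
headroom = ram_headroom_gb(running, new_vm_gb=4, physical_gb=16)

# The guideline holds as long as the headroom is not negative.
guideline_ok = headroom >= 0
```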
## 2026-05-27 — W3: throwaway VM created (booting) + W4 design notes
**W3:** Created `ccci-throwaway` in terraform-ci via the **Incus REST API** (curl through the 1055
proxy, since terraform and nix are absent on the sandbox; replicated `projects/incus-base/main.tf`): image
`incus-base-vm` (fp 3a0c4160), 4 GB RAM / 2 CPUs / **20 GB disk** (above the 10 GB default, to dodge cc-ci's old
ENOSPC). cloud-init writes /etc/nixos/{configuration,incus-base}.nix + setup.sh + /etc/ts-auth-key
(incus workspace reusable key) + /etc/ts-hostname=ccci-throwaway; runcmd runs setup.sh (nix-channel
nixos-24.11, `nixos-rebuild boot`, sysrq reboot → tailscale auto-joins). ssh_authorized_keys = vm_ssh_key
(I hold the private key) + mfowler + cc-ci-root key. CREATE+START ops Success, status Running; first boot takes ~4-6 min.
NOTE: cc-nix-test was terraform-created (`projects/cc-nix-test`), so my W1 resize via the API makes its tfstate
drift (reconcile or accept in W6 final-sizing).
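For reference, the CREATE and START calls described in W3 could be sketched as below. This is a sketch under assumptions, not the exact requests that were sent: the storage pool name `default`, selecting the image by alias rather than by fingerprint, and the helper names are all assumptions, and the cloud-init user-data is left empty. The code only builds the method, path and JSON body; it sends nothing.

```python
from urllib.parse import quote


def create_request(name, project, image_alias, memory, cpus, disk_size,
                   pool="default", user_data=""):
    """Build (method, path, body) for creating a VM via POST /1.0/instances.

    `pool` is an assumed storage pool name; `user_data` stands in for the
    cloud-init document, which is not reproduced here.
    """
    body = {
        "name": name,
        "type": "virtual-machine",
        "source": {"type": "image", "alias": image_alias},
        "config": {
            # Incus config values are strings, including numeric limits.
            "limits.memory": memory,
            "limits.cpu": str(cpus),
            "cloud-init.user-data": user_data,
        },
        "devices": {
            # Overriding the root disk is how the size is raised above the default.
            "root": {"type": "disk", "path": "/", "pool": pool, "size": disk_size},
        },
    }
    return "POST", f"/1.0/instances?project={quote(project)}", body


def start_request(name, project):
    """Build (method, path, body) for starting an existing instance."""
    path = f"/1.0/instances/{quote(name)}/state?project={quote(project)}"
    return "PUT", path, {"action": "start"}


create = create_request("ccci-throwaway", "terraform-ci", "incus-base-vm",
                        memory="4GB", cpus=2, disk_size="20GB")
start = start_request("ccci-throwaway", "terraform-ci")
```

Both calls are asynchronous in Incus and return an operation, which matches the "ops Success" wording above: the operation has to be waited on before the status reads Running.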
**W4 design (analysis; implement next):**
- cc-ci's `hosts/cc-ci/configuration.nix` pins tailscale `--hostname=cc-nix-test` and reads /etc/ts-auth-key,
  and `secrets.nix` decrypts ONLY via `age.sshKeyPaths` (host SSH key). Consequences for the throwaway:
  1. **Decryption:** the throwaway's host SSH key is NOT a sops recipient, so the cc-ci config as-is can't decrypt
     there. **W4 must add `sops.age.keyFile = "/var/lib/sops-nix/key.txt"`** and provision the **recovery
     age key** there (the ONE out-of-band secret). Open question: does a *missing* keyFile abort activation on
     cc-ci (where the file won't exist)? If yes, also provision cc-ci's own host-derived age key at that
     path (no new exposure). Otherwise keep sshKeyPaths+keyFile, once sops-nix is confirmed to tolerate the absence.
     Test path: add keyFile, deploy to cc-ci (rollback-safe via generations), observe.
  2. **Tailnet hostname:** after rebuild the throwaway rejoins as `cc-nix-test`, so tailscale auto-suffixes
     the duplicate; the REAL cc-ci is accessed by IP (100.90.116.4) so it's unaffected. Verify the
     throwaway via its own IP (Incus state tailscale0 addr) and/or incus-agent `exec` (hostname-independent).
  3. **Bridge side effect:** the throwaway's bridge would poll Gitea with the real token (fresh state ⇒ could
     re-trigger already-`!testme`'d PRs). Mitigate: run W4 when no `!testme` is pending; destroy promptly.
- Adding keyFile changes the closure again (the W2 byte-identical check was at `vh6vwxbl`); re-verify after.
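
The verification in the tailnet-hostname item (find the throwaway by its own tailscale0 address rather than by hostname) could be sketched as follows. It assumes the usual shape of an Incus `GET /1.0/instances/<name>/state` response, with per-interface `addresses` listed under `metadata.network`. The helper name, the sample response and its address are made up for illustration.

```python
def interface_ipv4(state_response, iface="tailscale0"):
    """Return the first IPv4 address of `iface` from an instance state response.

    Returns None if the interface is absent, e.g. before tailscale has joined
    or while the instance is stopped (when `network` may be null).
    """
    metadata = state_response.get("metadata") or {}
    network = metadata.get("network") or {}
    for addr in (network.get(iface) or {}).get("addresses", []):
        if addr.get("family") == "inet":
            return addr.get("address")
    return None


# Illustrative response only; the address is a placeholder, not the real one.
sample_state = {
    "type": "sync",
    "metadata": {
        "status": "Running",
        "network": {
            "tailscale0": {
                "addresses": [
                    {"family": "inet6", "address": "fd7a:115c:a1e0::1", "scope": "global"},
                    {"family": "inet", "address": "100.64.0.10", "scope": "global"},
                ]
            }
        },
    },
}

throwaway_ip = interface_ipv4(sample_state)
```

Because the lookup goes through the Incus instance name, it does not depend on whatever suffix tailscale gives the duplicate hostname.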