REVIEW-1c.md — Adversary ledger for Phase 1c (Full reproducibility + genuine D8 live rebuild)
Phase plan: /srv/cc-ci/cc-ci-plan/plan-phase1c-full-reproducibility.md
Definition of Done: C1–C7 (each must be Adversary-verified cold within 24h before DONE).
- C1 — Secrets-repo split (`cc-ci-secrets` private repo, secrets-only, consumed via flake input; base stays one well-parameterized repo; `nixosConfigurations.cc-ci` still byte-identical to running).
- C2 — Cert in git (wildcard cert+key are sops secrets in `cc-ci-secrets`, decrypted at activation; "operator drops a cert file" step gone; rebuild serves valid TLS from git-sourced cert).
- C3 — All secrets in git, one exception (only out-of-band secret = bootstrap age key; everything else sops-encrypted in git).
- C4 — Genuine throwaway-VM live rebuild (blank NixOS VM in `terraform-ci`, only bootstrap age key provisioned; clone base+secrets, `nixos-rebuild switch`, oneshots converge, cert+secrets decrypt, no manual step outside `docs/install.md`; Adversary performs cold).
- C5 — Honest D8 (evidence rewritten: static byte-identical closure + live throwaway rebuild; "infeasible by design" removed; any limitation narrow + Adversary-signed-off).
- C6 — Resource fit + cleanup (`cc-nix-test` 6→4 GB; throwaway VM at 4 GB; ≤~12 GB running guideline; throwaway destroyed after test; final sizing recorded in DECISIONS.md).
- C7 — Docs (install.md/secrets.md/architecture.md + plan refs updated to new model; fresh engineer can stand up an instance).
Mapping to method milestones: W1→C6(headroom), W2→C1/C2/C3, W3→C4(VM), W4→C4(rebuild), W5→C4/C5(cold proof+honest D8), W6→C6/C7(cleanup+docs).
Standing rules: verify every claim from a COLD START (fresh shell, own clone, no cached state). Re-run the acceptance check myself. Veto power: ## VETO <reason> forbids DONE until cleared.
Cold-start baseline @2026-05-27 (Phase 1c kickoff)
Adversary loop entered. Observations from cold start:
- `git pull --rebase` → up to date @ `492fa23` (Phase-1 DONE sign-off). No Phase-1c state files yet (STATUS-1c.md / BACKLOG-1c.md / JOURNAL-1c.md absent) — Builder has not begun 1c bootstrap. Nothing CLAIMED.
- `ssh cc-ci 'hostname && systemctl is-system-running'` → `nixos` / `running` (healthy, pre-refactor baseline).
- SOCKS proxy `127.0.0.1:1055` and `ssh cc-ci` working. Incus skill present at `/srv/incus-terraform-nix-vm-creator/skills/incus-terraform/SKILL.md`.
No gates to verify yet. Idling until the Builder seeds 1c state and claims the first gate (watchdog will ping on CLAIM). Will keep break-it probes ready (greps for plaintext secrets in base + store; cert-in-git decrypt path; byte-identical drift; throwaway-VM rebuild cold-repro).
Pre-W2 cold baselines @2026-05-27 16:10Z (reference values for verifying C1/C2/C3 after W2)
Builder has bootstrapped 1c state; W2 in flight, not yet CLAIMED. Decisions recorded by Builder (DECISIONS.md): secrets linkage = git submodule (deviates from flake-input default — rationale: no private-repo fetch cred at nix-eval, keeps defaultSopsFile a local path = minimal change + trivially byte-identical); bootstrap key for throwaway = recovery age key via sops.age.keyFile.
Reference values to compare against after W2:
- C1 byte-identical — running system toplevel: `/nix/store/m1pdvbhlmlj3x3gn0x83rgwcgssks7qs-nixos-system-nixos-24.11.20250630.50ab793` (booted: `09ia5qd0jw0nghx83b4fijcg2jak9cp4-…`). nixos-version `24.11.20250630.50ab793 (Vicuna)`. After the refactor, `nixos-rebuild build .#cc-ci` must produce the same toplevel (pure structural move ⇒ identical closure).
- C2 cert content — out-of-band cert at `cc-ci:/var/lib/ci-certs/live/`: `fullchain.pem` 2909 B sha256 `c1d96d61a43bfec10716e18d13832bd325ef173e9af01f197a48490481300080`; `privkey.pem` 227 B sha256 `9ec25d00910677718762713717b8c763da46fa7489e292b057e916a252d0ca42` (EC key). After W2 these must be sops-decrypted from git to the same path with the same hashes, and the operator-cert-drop precondition framing in proxy.nix must be gone.
- C3 no-plaintext — base repo clean: `secrets/secrets.yaml` is sops `ENC[AES256_GCM,…]`; `git grep` for `BEGIN … PRIVATE KEY|BEGIN CERTIFICATE` outside `secrets/` = 0 matches; no `*.pem`/`*.key`/`*.crt`/`*.p12`/`*.pfx` tracked. After W2: cert+key must be `ENC[…]` in `cc-ci-secrets`, never plaintext; base must stay clean; also grep the Nix store for decrypted secret material at activation.
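The C3 no-plaintext baseline above can be approximated on any checkout with a small scan. This is a minimal sketch, not the Adversary's actual probe: `scan_plaintext` is a hypothetical helper name, it walks the working tree with GNU `grep`/`find` rather than using `git grep` on tracked files, and it skips any directory named `secrets` (assumed to hold only sops-encrypted material).

```shell
# Sketch: report files that look like plaintext key/cert material.
# Assumptions: GNU grep and find; "secrets" directories contain only sops ENC[...] data.
scan_plaintext() {
  root="$1"
  hits="$(
    {
      # PEM markers anywhere outside .git and secrets/
      grep -rIlE --exclude-dir=.git --exclude-dir=secrets \
        -e '-----BEGIN ([A-Z]+ )?PRIVATE KEY-----|-----BEGIN CERTIFICATE-----' \
        "$root" || true
      # Key-like files by extension (same list as the baseline above)
      find "$root" -path "$root/.git" -prune -o -type f \
        \( -name '*.pem' -o -name '*.key' -o -name '*.crt' \
           -o -name '*.p12' -o -name '*.pfx' \) -print
    } | sort -u
  )"
  [ -z "$hits" ] && return 0   # clean tree
  printf '%s\n' "$hits"        # list offenders, one path per line
  return 1
}
```

Run as `scan_plaintext /path/to/clone`; a zero exit status with no output corresponds to the "0 matches" result recorded in the baseline.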
Things to scrutinize hard when W2 is CLAIMED:
- Submodule actually points at a private `recipe-maintainers/cc-ci-secrets` holding only encrypted secrets (no code/config logic).
- Byte-identical: same toplevel store path (or differences are only expected & explained — zero functional drift).
- Cert genuinely served from the git-sourced cert after switch (live TLS handshake on a `*.ci.commoninternet.net` host), not the stale out-of-band file.
- All D1–D10 still hold after the refactor (no regression) — spot-check the live system health + a `!testme`-path sanity check before DONE.
Interim probe @2026-05-27 16:22Z — cc-ci-secrets repo (pre-W2-gate; not a gate verdict)
Independent cold check of the new secrets repo (Builder W2 step 1, commit f972bc1), via Gitea API with bot creds:
- `recipe-maintainers/cc-ci-secrets` exists, `private: True`, non-empty. Top-level: `.sops.yaml`, `README.md`, `secrets.yaml` (no code / no config logic — matches §2's "encrypted secrets only"; README is doc-only and leak-clean).
- `secrets.yaml`: all 8 keys `ENC[...]` — 6 infra (test_secret, drone_rpc_secret, drone_gitea_client_secret, bridge_drone_token, bridge_gitea_token, bridge_webhook_hmac) + `wildcard_cert` + `wildcard_key`. 0 plaintext PEM/cert markers; sops `mac` metadata present. → cert+key genuinely moved into sops-in-git (C2/C3 secrets-side looks good).
- Layout nuance: secrets file is at repo root `secrets.yaml`; Builder will mount the submodule at base `secrets/` so it resolves to `secrets/secrets.yaml`. OK for the submodule linkage.
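The "all 8 keys `ENC[...]`" observation can be rechecked mechanically on a local copy of the file. The sketch below is an illustration only: `unencrypted_keys` is a hypothetical helper, and it assumes a flat `secrets.yaml` where each secret is a top-level `key: value` pair on one line and the `sops:` metadata block is the only nested mapping.

```shell
# Sketch: print top-level keys whose value does not start with ENC[.
# Assumptions: flat YAML, one secret per line; indented lines (sops metadata) are ignored.
unencrypted_keys() {
  awk '
    /^[A-Za-z0-9_]+:/ {
      key = $0; sub(/:.*/, "", key)            # text before the first colon
      if (key == "sops") next                  # skip the sops metadata block header
      val = $0; sub(/^[^:]*:[ \t]*/, "", val)  # text after "key:"
      if (val !~ /^ENC\[/) print key
    }
  ' "$1"
}
```

Empty output means every top-level secret is encrypted; any printed key name is a candidate plaintext leak to inspect by hand.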
Not yet verifiable (needs W2 base-switch + activation): byte-identical build==running (C1), cert sops-decrypts to the same hashes at /var/lib/ci-certs/live/ (C2 — must match fullchain c1d96d61…, privkey 9ec25d00…), no plaintext leak into the Nix store, live TLS from git-cert, and no D1–D10 regression. Will run these when Gate W2 is CLAIMED.
W2: PASS @2026-05-27 16:55Z — secrets-split + cert-in-git (verifies C1, C2, C3) — COLD
Gate W2 CLAIMED by Builder (commits f972bc1/f79e542/faa3709; running toplevel vh6vwxbl…). Verified independently from a cold start (fresh clone on cc-ci, own checks, no reliance on the Builder's /root/cc-ci):
(1) Byte-identical build==running (C1) — PASS. Fresh recursive clone of origin/main (HEAD 0633aa7) on cc-ci into /tmp/advverify, submodule secrets→2312f1c initialized with bot creds (via http.extraheader, not URL/args), secrets/secrets.yaml present + ENC[…]. nixos-rebuild build --flake 'git+file:///tmp/advverify?submodules=1#cc-ci' → /nix/store/vh6vwxbl4qr9whzpwgjimhf9gn4329p8-nixos-system-… == /run/current-system (readlink -f identical). Zero drift — the currently published repo+submodule reproduces the currently running system byte-for-byte. Base stays one parameterized repo; only secrets/ is the external private submodule.
(2) Cert in git + live TLS (C2) — PASS. /var/lib/ci-certs/live/{fullchain.pem,privkey.pem} are now symlinks → /run/secrets/wildcard_cert,wildcard_key (sops-decrypted at activation), not out-of-band files. File sha256 c1d96d61…/9ec25d00… == my pre-W2 operator-cert baseline (byte-identical cert, now git-sourced). secrets.nix adds wildcard_cert(0444)/wildcard_key(0400) with a comment that this "Replaces the prior operator-drops-a-cert-file step." Live HTTPS https://ci.commoninternet.net via proxy → http_code=200, ssl_verify_result=0, served leaf = LE *.ci.commoninternet.net (SAN *.ci+bare), valid 2026-05-26→08-24. Served leaf fingerprint 57:8D:67:9E:FE:89:…:B8:A6 == the git-sourced cert's leaf fingerprint (computed locally from the decrypted file) → live TLS provably served from the git cert, full chain of custody intact.
(3) No plaintext leak (C3) — PASS. Base repo: secrets/ is a gitlink (.gitmodules→ private cc-ci-secrets); no *.pem/*.key tracked; git grep BEGIN…PRIVATE KEY|CERTIFICATE outside REVIEW text = 0. cc-ci-secrets: all 8 secrets ENC[…] (6 infra + cert + key), 0 plaintext PEM, valid sops MAC, private repo. On the host: secrets decrypt to /run/secrets.d (ramfs, in-memory), not the world-readable store; no private key found in the system-closure store dirs.
Non-regression: systemctl is-system-running=running, 0 failed units; swarm stack all 1/1 (traefik v3.6.15, drone 2.26.0, ccci-bridge, ccci-dashboard, backups), drone-runner-exec running; reconcile oneshots converged. No D1–D10 regression observed.
→ C1, C2, C3 Adversary-PASS (24h freshness clock starts now; will be re-exercised on the blank host at C4). Remaining for DONE: C4 (genuine throwaway-VM live rebuild), C5 (honest D8), C6 (resize+cleanup), C7 (docs). No VETO.
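The chain-of-custody comparison in (2), served leaf fingerprint versus the leaf in the decrypted file, can be sketched with standard `openssl` subcommands. The helper names below are hypothetical, and the sketch assumes the endpoint is directly reachable from where it runs (no SOCKS proxy handling).

```shell
# Sketch: compare the SHA-256 fingerprint of a served leaf cert with one on disk.
# Helper names are illustrative; assumes openssl is installed and the host is reachable.

# Strip the "sha256 Fingerprint=" prefix and upper-case hex digits,
# so output from different OpenSSL versions compares equal.
normalize_fp() {
  fp="${1#*=}"
  printf '%s\n' "$fp" | tr 'a-f' 'A-F'
}

# Fingerprint of the first certificate (the leaf) in a PEM file such as fullchain.pem.
file_leaf_fp() {
  normalize_fp "$(openssl x509 -in "$1" -noout -fingerprint -sha256)"
}

# Fingerprint of the leaf a server presents for SNI $1 (port $2, connect address $3).
served_leaf_fp() {
  host="$1"; port="${2:-443}"; addr="${3:-$1}"
  normalize_fp "$(openssl s_client -connect "$addr:$port" -servername "$host" \
      </dev/null 2>/dev/null | openssl x509 -noout -fingerprint -sha256)"
}
```

A match would then be checked as `[ "$(served_leaf_fp ci.commoninternet.net)" = "$(file_leaf_fp /var/lib/ci-certs/live/fullchain.pem)" ]`, run where the decrypted cert is readable.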
Corroboration @2026-05-27 17:23Z — sops cert re-decrypts at BOOT (after W1 resize-reboot)
W1 (Builder, 6c03a27) resized cc-nix-test 6→4 GB and rebooted the live server. Cold spot-check post-reboot: system running, 0 failed, mem 3575 MB (≈4 GB applied), live TLS http_code=200 ssl_verify=0. Cert symlink target moved /run/secrets.d/8/ → /1/ (ramfs wiped on reboot) but fullchain.pem sha256 still c1d96d61…. → the git-sourced sops cert re-decrypts byte-identically at boot, not only at switch — strengthens C2 (reproducible from git across a cold boot). No formal gate (W1 has no Adversary gate); W4 = next gate. Builder W3 DONE: throwaway VM reachable 100.126.124.86.
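Because the ledger quotes hashes by prefix (for example `c1d96d61…`), a post-reboot recheck can be scripted as a prefix comparison. A minimal sketch, with `sha256_matches` as a hypothetical helper and GNU `sha256sum` assumed:

```shell
# Sketch: succeed if the file's SHA-256 starts with the expected (possibly truncated) hex.
# sha256sum follows symlinks, so it works on /var/lib/ci-certs/live/fullchain.pem as described above.
sha256_matches() {
  file="$1"; expected="$2"
  [ -n "$expected" ] || return 2            # refuse an empty expectation
  actual="$(sha256sum "$file" | cut -d' ' -f1)"
  case "$actual" in
    "$expected"*) return 0 ;;
    *) printf 'mismatch: %s is %s\n' "$file" "$actual" >&2; return 1 ;;
  esac
}
```

For a gate verdict the full 64-character baseline value is the safer argument; a short prefix is only a quick sanity check.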
C4/W5 verification standard (set @2026-05-27 17:30Z — read before claiming W4)
My cold proof of the throwaway-VM live rebuild (C4) will require, and I will REJECT a skipped/faked TLS check:
- Rebuilt VM keeps `DOMAIN = ci.commoninternet.net` (same instance ⇒ proves the SAME system reproduces). The git cert only covers `*.ci.commoninternet.net` + bare — do NOT use a `ci2.commoninternet.net` domain (no `*.ci2` cert ⇒ TLS unverifiable / would be a fake pass).
- Fresh VM has a NEW tailnet IP; public DNS for `*.ci.commoninternet.net` → gateway → the real cc-ci, not the fresh VM. So verify TLS on the fresh VM itself, forcing resolution to the VM: `curl --resolve <host>.ci.commoninternet.net:443:127.0.0.1` (or to the VM's tailnet IP), SNI `ci.commoninternet.net`.
- Served leaf fingerprint must == the git cert leaf `57:8D:67:9E:FE:89:…:B8:A6` (sha256), proving Traefik on the rebuilt host serves the sops-from-git cert. Cert-from-git serving is an integral part of the C4/D8 proof.
- Plus: oneshots converge (swarm/proxy/drone/bridge/dashboard), all secrets decrypt, no manual step outside `docs/install.md`, only the bootstrap age key provisioned out-of-band.
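The forced-resolution TLS check in the standard above can be sketched as a pair of helpers. Both names are hypothetical. The acceptance rule encoded in `tls_verified` (certificate verification result 0 and a real HTTP response) is an assumption about what counts as a non-faked pass, not a rule taken from the plan.

```shell
# Sketch: probe HTTPS on a specific IP while keeping the real hostname for SNI and verification.
# Prints "<http_code> <ssl_verify_result>", e.g. "200 0".
probe_tls() {
  host="$1"; ip="$2"
  curl --silent --output /dev/null --max-time 15 \
       --resolve "$host:443:$ip" \
       --write-out '%{http_code} %{ssl_verify_result}' \
       "https://$host/"
}

# Accept only if the chain verified (0) and an HTTP response actually arrived (code != 000).
tls_verified() {
  code="${1%% *}"; verify="${1##* }"
  [ "$verify" = "0" ] && [ "$code" != "000" ]
}
```

On the fresh VM this would be used as `tls_verified "$(probe_tls ci.commoninternet.net 127.0.0.1)"`, substituting the VM's tailnet IP when probing from another machine. The fingerprint comparison against `57:8D:67:9E:FE:89:…:B8:A6` remains a separate step.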
C1 refresh @2026-05-27 18:00Z — byte-identical against NEW keyFile config (izsmiajw)
Builder W4 Step A (9cc6788/24fe11a) added sops.age.keyFile (recovery key on clones, host-derived on cc-ci) and switched cc-ci → new toplevel izsmiajwjwa12356mm35fw08jdy5f0zs (supersedes the vh6vwxbl from my 16:55 W2 PASS). Re-verified cold: fresh recursive clone (HEAD 24fe11a, submodule 2312f1c) → nixos-rebuild build = izsmiajw == /run/current-system. BYTE-IDENTICAL: YES, zero drift. Live host healthy (running, 0 failed), cert sha c1d96d61…, TLS 200/ssl_verify=0. → C1 stays Adversary-PASS against the current running config; clock refreshed 18:00Z. (W4 Step B throwaway rebuild still in flight — not yet CLAIMED.)
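The byte-identical comparison used here reduces to resolving two symlinks and comparing the store paths. A minimal sketch, assuming the build was run in the current directory so that `nixos-rebuild build` left a `./result` symlink; `same_toplevel` is a hypothetical helper name.

```shell
# Sketch: compare a built toplevel with the running system by resolved store path.
# Arguments: path to the build result symlink, optional path to the running system link.
same_toplevel() {
  built="$(readlink -f "$1")" || return 2
  running="$(readlink -f "${2:-/run/current-system}")" || return 2
  printf 'built:   %s\nrunning: %s\n' "$built" "$running"
  [ "$built" = "$running" ]   # identical store path means identical closure
}
```

After a cold clone and build, `same_toplevel ./result` returns 0 only when the published repo and submodule reproduce the running system exactly.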